A SAMPLING PROCEDURE FOR MAILED QUESTIONNAIRES


M. A. El-Badry
Cairo University 


INTRODUCTION 


The mailed questionnaire is desirable in surveys because it is economical of
money and effort, free from the interviewer’s effect on the respondent, and
able to reach individuals not accessible by visit or telephone. However, it has
several drawbacks which can seriously limit its efficiency as a means of
gathering reliable data. The mailed answers are often exposed to two kinds of bias:
(1) bias arising from the respondents’ misunderstanding the questions, resent- 
ment of interference in their personal affairs, or falsification for reasons con- 
nected with the subject of the survey; (2) bias of non-response, which arises 
from differences in the characteristic under investigation between respondents 
and non-respondents. The latter bias is often more pronounced in mailed ques- 
tionnaires than in personal interviews and its reduction to acceptable limits will 
always necessitate extra effort, time and skill. 

The problem of non-response has already been tackled from various angles. 
Hansen and Hurwitz [4] studied a survey which combines the advantages of 
mailed questionnaires and personal interviews. The plan first utilizes the 
economies involved in the use of questionnaires by mailing them to a sample of 
the population under investigation. A follow-up is then carried out by inter- 
viewing a subsample of the non-respondents to the mail canvass, thus eliminat- 
ing substantially the bias of non-response in the first stage. The optimum allo- 
cation of the mail and field samples is obtained by requiring minimum cost for 
an assigned precision. The sizes of the two samples naturally depend on the 
rate of non-response to the mailed questionnaire and the variances of the char- 
acteristic under investigation both among the whole population and among the 
non-respondents. These parameters are assumed to be known from previous 
experience. 

Politz and Simmons [5] devised a technique for circumventing the need for 
call-backs in interviewing surveys. Each person in the sample is visited once, 
at a time determined at random. From each person interviewed information is 
obtained as to whether he was at home at certain times, which leads to an 
estimate of the proportion of the time he is at home during the interviewing 
hours. The sample estimate is produced by weighting the results for each group 
that spends the same proportion of time at home during the interviewing hours 
by the reciprocal of the estimated percentage of time that the members of this 
group are at home. 


Birnbaum and Sirken [1, 2] gave several results pertinent to a survey which
aims at estimating the proportion having a certain characteristic in a
population that has a known rate of non-response. They gave the minimum sample
size which guarantees a total error in the estimate not exceeding a prescribed 
value with a certain risk, despite the presence of the error of non-response. 
They also gave the expected cost which guarantees a stated upper limit of error 
with the desired risk when the survey involves a number of call-backs on all 
the non-respondents. The rates of non-response in each of the call-backs are 
supposed to be known. The authors finally illustrated a procedure for finding by 
trial the optimum number of call-backs which minimizes the cost. 

Deming [3] studied a probability mechanism for dealing with non-response 
by dividing the population into six classes according to the average number of 
interviews completed out of 8 attempts. The six classes comprise the individuals 
with whom the average number of successful interviews out of 8 attempts is 
0, 1, 2, 4, 6, 8. The plan consists of an initial attempt on a sample drawn from 
the population, a first call-back on a fraction of non-responses to the first 
attempt, and then a number of call-backs each of which is a canvass of the non- 
responses in the preceding attempt. The author gives formulas for the bias of 
the estimate (as measured from the expected value in the classes 1-8) and its 
mean square error. The sampling fraction employed in the first call-back is de- 
termined so as to minimize the mean square error for fixed cost. The optimum 
number of call-backs when precision is fixed is determined by figures on cost. 
The author then carried out numerical calculations in order to study the effect 
of the call-backs and the size of the initial sample on the bias and the mean 
square error of the estimate and also on the cost of the survey. Numerical 
assignments for the following unknowns were made and employed in the calcu- 
lations: (1) The proportions of the population in the classes 0-8; the employed 
proportions were taken from average urban experience. (2) Values of the varia- 
ble under investigation and its standard deviation in each of the classes 1-8. 
(3) The cost per call: this cost was supposed to be the same for all attempts 
after the first. 


PURPOSE AND PLAN OF THIS WORK 


Three important general conclusions may be drawn from the above works 
and from the existing literature on the general problem of non-response: (1) the
hazard of putting confidence in the results of a survey that neglects the effect 
of non-response; (2) the inefficiency and futility of complete coverage or of 
resorting to bigger samples as a means of overcoming the error of non-response; 
(3) the usefulness and importance of utilizing past experience about the survey 
or similar surveys in the planning stage. 

One aim of this work is to draw a plan that utilizes the information available 
to the investigator to achieve maximum efficiency; i.e., to secure an estimate 
which is as free as possible from the bias of non-response and which has
maximum precision for the available cost or the least cost for attaining a
required precision. The plan is constructed as follows:

(1) The economical mailed questionnaires are used to collect the bulk of the 
required data. 
(2) Successive waves of questionnaires are mailed in successive attempts to
reach the more stubborn classes of non-respondents. 

(3) This procedure is continued up to a certain stage beyond which the mail 
questionnaires cease to be effective. More effort and cost are then expended in a 
final interviewing stage in which individuals are contacted personally in order
to secure their answers and eliminate the bias of non-response. This final step is 
taken when it is found to be more advantageous than undertaking further mail 
attempts before starting the final interviewing stage. 

(4) A random sample rather than a complete canvass is used whenever 
possible in each stage of the survey: in the initial attempt, in each of the call- 
backs on the non-respondents, and in the final interviewing stage. 

(5) Estimates are calculated from the pooled results of the attempts.

Another aim of this work is to discuss two important questions which arise in 
the planning stage of a survey involving call-backs. The first is whether it is 
advantageous, in the sense explained above, to call back on samples rather than 
on the whole of the non-respondents, and, in the cases where that is true, what 
sampling fraction is optimum in each attempt. The second question is how far 
the investigator should continue his mail attempts before he resorts to personal 
interviewing to secure the answers of the non-respondents. We will see from the 
discussion that there are many cases where complete coverage of the non- 
respondents is inefficient because one can still attain the same precision for a 
lower cost or a higher precision for the same cost by sampling the non-respond- 
ents at each stage. General conditions will be given to designate the cases 
where the optimum procedure is applicable, as well as formulas for the optimum 
sampling fractions in each attempt, the initial sample size, and the cost of the 
survey. An inequality which tells before starting a certain attempt whether it 
should be performed by mail or by personal interview will also be established. 

A simple model for the population is used here. We assume that under the 
circumstances of the survey the population can be stratified according to which 
attempt succeeds in obtaining the individuals’ answers if complete coverage of 
the population is carried out. In other words, we have a stratum of individuals 
who respond to the first attempt, a second stratum of those who respond to the 
second attempt, and so on. The investigator naturally has no way of separating 
these strata in the planning stage. He gets information about the strata by 
sending the successive waves of questionnaires. Any particular wave will secure 
answers from one and only one stratum: the respondents to that wave. Non- 
response takes place only because that wave has contacted individuals who do 
not belong to the corresponding stratum. The responses to that wave will 
furnish an unbiased estimate of the characteristic under investigation in the 
corresponding stratum. 

The estimate calculated by proper weighting of the results of the attempts 
will thus be unbiased. The only possible source of bias here is the failure to 
secure answers from the comparatively few individuals who cannot be reached 
in the final interviewing stage. The bias of non-response in this class cannot be 
learned from the survey itself. It seems likely, however, that the economies in- 
volved in the use of mail rather than personal interview to collect a major part 
of the required data will enable the investigator to put more money and effort 
into the final interviewing stage in order to strive for complete coverage. The
optimum planning of the survey under study will require knowledge of the 
optimum values for the initial sample size, the sampling fraction in each of the 
attempts, and the number of attempts which will lead to maximum precision 
for given cost or minimum cost for a required precision. 

The establishment of this criterion is one of various possible approaches to 
the general problem of planning a survey. Several other approaches, some of 
which are discussed in the above-mentioned papers, have been established. The 
important criterion of minimum mean square error adopted by Deming [3] 
takes into consideration both the bias and the variance of the estimate. The 
bias in Deming’s plan arises from the partial non-response of the classes 1-6. 
As pointed out above, this bias does not exist in the proposed plan where the 
mean square error will be identically equal to the variance. 

The present stratification differs from that applied by Politz-Simmons [5] or 
by Deming [3]. The Politz-Simmons stratification is according to the propor- 
tion of time spent at home during the interviewing hours, which is irrelevant 
in a mailing survey. Deming stratifies the population according to the average 
number of interviews completed successfully out of eight attempts. This tech- 
nique is applied to situations in which the successful completion of the inter- 
view depends upon two factors: (1) whether the interviewer will find the re- 
quired individual at home when he calls, and (2) whether the individual will 
respond when he is interviewed. The first of these factors does not arise in mail 
surveys except for those individuals who cannot be traced by mail. This is why 
the present stratification is done according to the second factor only. The pres- 
ent stratification seems to render it more likely for an investigator who wants 
to utilize past experience in planning the survey to get the required information. 
In a repeated survey or in a survey where the investigator utilizes experience 
with similar surveys it seems much easier to find data about the respondents
and the non-respondents in each attempt than about the individuals who re- 
spond to a certain proportion of attempts on the average. 

Another difference between the proposed plan and Deming’s is that the latter 
uses a sampling fraction of non-respondents in the second attempt only; the 
following attempts are complete canvasses of non-respondents. The proposed 
plan uses a sample of non-respondents in every attempt, including the final 
interviewing stage, whenever this is possible. 

The proposed plan is clearly an extension of the Hansen-Hurwitz plan [4] 
which involves one mail attempt followed by a personal interview of a sample 
of the non-respondents. 

It should be stated here that the present model is an approximation to 
reality. We assume here that an individual answers after a definite number of 
call-backs are made. There are various personal factors which determine
whether he will answer at the expected call or before or after it; whether he has 
the time, whether he is in the mood, etc. Nevertheless this simple model has 
the advantage of being designed, as mentioned above, so as to make it more 
likely for the investigator to be able to utilize past experience in planning the 
survey. Another advantage of the present plan is that it leads to formulas for 
the initial sample size, the sampling fractions, and the cost, which do not involve 
the bias of estimation. This will reduce the number of unknowns which the
investigator has to estimate roughly in the planning stage. 

Another point regarding the effect of the nature of the plan on the collected 
data ought to be mentioned here. The mail survey will present different sets
of stimuli to individuals, and elicit from them somewhat different responses, 
depending upon the number of questionnaires they must receive before they 
respond. Another difference in stimulation and response may be found among 
the final group which will be interviewed personally. The effect of this stimula- 
tion and the correction thereto are beyond the scope of this work which is 
confined to the procedure of collecting the factual data. 


ESTIMATION OF THE POPULATION MEAN 


The survey under consideration aims at estimating the mean value $\mu$ or the
total $T$ of a certain characteristic in a population of $N$ individuals. The
population is assumed to be theoretically divided into the following strata:
(1) Individuals who would respond to the first mail attempt if they are contacted.
This stratum has a proportion $P_{11}$, a mean $\mu_{11}$ and a variance
$\sigma_{11}^2$. (2) Individuals who would need exactly two mail calls before they
respond. These have a proportion $P_{21}$, a mean $\mu_{21}$ and a variance
$\sigma_{21}^2$, .... (m) Individuals who would need exactly $m$ mail calls to
secure their answers. These have a proportion $P_{m1}$, a mean $\mu_{m1}$ and a
variance $\sigma_{m1}^2$. (m+1) Individuals who would not respond even after $m$
mail calls. These have a proportion $P_{m2}$, a mean $\mu_{m2}$ and a variance
$\sigma_{m2}^2$.

The proportions of the strata are connected by the relation

$$\sum_{i=1}^{m} P_{i1} + P_{m2} = 1 \qquad (1)$$

and the variances within the strata are the summations

$$\frac{1}{n_i - 1}\sum_{j}(X_{ij} - \mu_i)^2 .$$


The field operation is started by mailing questionnaires to a sample of size
$n_1$ picked at random from the population. This step will lead to $n_{11}$
responses and $n_{12}$ non-responses. The mean value of the characteristic under
investigation among the $n_{11}$ respondents will be $\bar X_{11}$. A second
sample of size $k_2 n_{12}$ is then taken from the non-respondents and
re-contacted in a second mail attempt, giving $n_{21}$ responses whose mean is
$\bar X_{21}$ and $n_{22}$ non-responses. From the latter group, $k_3 n_{22}$ are
chosen at random and followed up by mail; and so on for $m$ mail attempts. The
$(m+1)$th and final step in the field research is to interview a random sample
of size $w n_{m2}$ chosen from the $n_{m2}$ non-respondents to the $m$th mail
attempt. This final sample will have a mean $\bar X_{(m+1)1}$.

This procedure leads, after appropriate weighting, to an unbiased estimate 
of the mean or the total of the population. The absence of bias is due to the 
fact that since each call-back will secure a random sample of readings from one 
particular stratum, the expected values of the readings obtained in that call- 
back will all be equal to the mean value of the corresponding stratum. The 
following discussion is confined to the estimation of the population mean, but
the adaptation to the population total is obvious. 

We want to use here an estimate of the population mean which has the 
following properties: (a) linearity, (b) no bias and (c) minimum variance
subject to (a) and (b). This estimate will be of the form 


$$\bar X = c\sum_{h=1}^{n_{11}} X_{11}^{h} + d\sum_{h=1}^{n_{21}} X_{21}^{h} + \cdots + g\sum_{h=1}^{w n_{m2}} X_{(m+1)1}^{h} \qquad (2)$$

where $c, d, \ldots, g$ are constants and $X_{i1}^{h}$ $(i = 1, 2, \ldots, m+1)$
denotes the $h$th observation in the $i$th attempt.
Now the expected value of (2) is 


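$$E\bar X = c\mu_{11}En_{11} + d\mu_{21}En_{21} + e\mu_{31}En_{31} + \cdots + f\mu_{m1}En_{m1} + g\mu_{m2}\,wEn_{m2}$$

$$= n_1 c P_{11}\mu_{11} + n_1 d k_2 P_{21}\mu_{21} + n_1 e k_2 k_3 P_{31}\mu_{31} + \cdots + n_1 f\Big(\prod_{2}^{m} k_s\Big)P_{m1}\mu_{m1} + n_1 g w\Big(\prod_{2}^{m} k_s\Big)P_{m2}\mu_{m2}$$

and since this has to be identically equal to $\mu$, i.e. to

$$P_{11}\mu_{11} + P_{21}\mu_{21} + P_{31}\mu_{31} + \cdots + P_{m1}\mu_{m1} + P_{m2}\mu_{m2},$$

the constants in (2) must be:

$$c = \frac{1}{n_1},\quad d = \frac{1}{n_1 k_2},\quad e = \frac{1}{n_1 k_2 k_3},\quad \ldots,\quad f = \frac{1}{n_1\prod_{2}^{m} k_s},\quad g = \frac{1}{n_1 w\prod_{2}^{m} k_s}.$$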


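Substituting in (2) we get the following formula for the best estimate of $\mu$:

$$\bar X = \frac{1}{n_1}\Big\{ n_{11}\bar X_{11} + \frac{n_{21}}{k_2}\bar X_{21} + \frac{n_{31}}{k_2 k_3}\bar X_{31} + \cdots + \frac{n_{m1}}{\prod_{2}^{m} k_s}\bar X_{m1} + \frac{n_{m2}}{\prod_{2}^{m} k_s}\bar X_{(m+1)1}\Big\} \qquad (3)$$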
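
The weighting in estimate (3) can be illustrated with a short Python sketch. This is only an illustration under stated assumptions: the function name `pooled_mean` and its argument layout are hypothetical and do not come from the paper; each mail wave is passed as a pair (number of respondents, mean of respondents), and the sampling fractions are passed as the list $k_2, \ldots, k_m$.

```python
from math import prod


def pooled_mean(n1, mail_waves, k, n_m2, interview_mean):
    """Illustrative version of estimate (3); names are hypothetical.

    n1             -- initial sample size n_1
    mail_waves     -- [(n_11, xbar_11), ..., (n_m1, xbar_m1)]
    k              -- [k_2, ..., k_m], sampling fractions of the mail call-backs
    n_m2           -- number of non-respondents to the m-th mail wave
    interview_mean -- mean of the interviewed subsample, xbar_(m+1)1
    """
    if len(k) != len(mail_waves) - 1:
        raise ValueError("need one sampling fraction per mail call-back")
    total = 0.0
    for i, (n_i1, xbar_i1) in enumerate(mail_waves):
        # Wave i+1 is inflated by 1 / (k_2 ... k_(i+1)); the first wave has weight 1
        # because the product over an empty slice is 1.
        total += n_i1 * xbar_i1 / prod(k[:i])
    # The interviewed subsample stands in for all n_m2 remaining non-respondents.
    total += n_m2 * interview_mean / prod(k)
    return total / n1


# Made-up numbers: 100 mailed, 60 reply; half of the 40 non-respondents are
# re-mailed, 10 reply, 10 do not; a subsample of those 10 is interviewed.
print(pooled_mean(100, [(60, 10.0), (10, 20.0)], [0.5], 10, 30.0))  # 16.0
```

In this made-up case the effective weights are 60, 20 and 20, which add up to the initial sample size of 100, as the unbiasedness argument requires.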


We now proceed to calculate the variance of $\bar X$. The following procedure is
perhaps shorter than the straightforward method. 

Let us first introduce some parameters of the non-responding groups to each 
of the m attempts. We have already subdivided the population into strata of 
prospective respondents to certain attempts; the $i$th stratum $(i = 1, 2, \ldots, m)$
comprises the respondents to the $i$th attempt. According to the notation
adopted before, this stratum has a proportion $P_{i1}$ of the total population, a
mean $\mu_{i1}$ and a variance $\sigma_{i1}^2$. Now corresponding to the $i$th
stratum we have a well defined group comprising all the members of the population
who would not respond even after $i$ mail calls. Obviously this group involves all
the strata from the $(i+1)$th through the $(m+1)$th. We are going to denote the
proportion of the total population in that group by $P_{i2}$ where, clearly,

$$P_{i2} = P_{(i+1)1} + P_{(i+1)2}.$$

We will also denote the mean and the variance in that group by $\mu_{i2}$ and
$\sigma_{i2}^2$ respectively.
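Now the estimate (3) can be put in the form:

$$\bar X = \frac{1}{n_1}\Big\{(n_{11}\bar X_{11} + n_{12}\bar X_{12}) + \Big(\frac{n_{21}}{k_2}\bar X_{21} + \frac{n_{22}}{k_2}\bar X_{22} - n_{12}\bar X_{12}\Big) + \Big(\frac{n_{31}}{k_2 k_3}\bar X_{31} + \frac{n_{32}}{k_2 k_3}\bar X_{32} - \frac{n_{22}}{k_2}\bar X_{22}\Big) + \cdots + \frac{n_{m2}}{\prod_{2}^{m} k_s}\big(\bar X_{(m+1)1} - \bar X_{m2}\big)\Big\}$$

$$= \frac{1}{n_1}\Big\{ n_1\bar X_{1\cdot} + n_{12}(\bar X_{2\cdot} - \bar X_{12}) + \frac{n_{22}}{k_2}(\bar X_{3\cdot} - \bar X_{22}) + \cdots + \frac{n_{m2}}{\prod_{2}^{m} k_s}\big(\bar X_{(m+1)1} - \bar X_{m2}\big)\Big\},$$

where $\bar X_{i2}$ $(i = 1, 2, \ldots, m)$ is the mean of the non-responding group in the
$i$th sample and $\bar X_{i\cdot}$ is the mean of the whole $i$th sample, including both
respondents and non-respondents. Hence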


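$$\operatorname{var}\bar X = \frac{1}{n_1^2}E\Big[ n_1(\bar X_{1\cdot} - \mu) + n_{12}(\bar X_{2\cdot} - \bar X_{12}) + \frac{n_{22}}{k_2}(\bar X_{3\cdot} - \bar X_{22}) + \cdots + \frac{n_{m2}}{\prod_{2}^{m} k_s}\big(\bar X_{(m+1)1} - \bar X_{m2}\big)\Big]^2$$

$$= \frac{1}{n_1^2}\Big\{ E n_1^2(\bar X_{1\cdot} - \mu)^2 + E n_{12}^2(\bar X_{2\cdot} - \bar X_{12})^2 + \frac{1}{k_2^2}E n_{22}^2(\bar X_{3\cdot} - \bar X_{22})^2 + \cdots + \frac{1}{\big(\prod_{2}^{m} k_s\big)^2}E n_{m2}^2\big(\bar X_{(m+1)1} - \bar X_{m2}\big)^2\Big\}.$$

The product terms vanish because the non-respondents of the $i$th attempt
constitute the population sampled in the $(i+1)$st attempt.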
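Now

$$E n_{i2}^2(\bar X_{(i+1)\cdot} - \bar X_{i2})^2 = E n_{i2}^2(\bar X_{(i+1)\cdot} - \mu_{i2})^2 - E n_{i2}^2(\bar X_{i2} - \mu_{i2})^2,$$

the expected value of the product term vanishing as before. But

$$E n_{i2}^2(\bar X_{(i+1)\cdot} - \mu_{i2})^2 = E\big\{ n_{i2}^2 E_{n_{i2}}(\bar X_{(i+1)\cdot} - \mu_{i2})^2\big\} = E\Big\{ n_{i2}^2\,\frac{\sigma_{i2}^2}{k_{i+1} n_{i2}}\Big\},$$

$$E n_{i2}^2(\bar X_{i2} - \mu_{i2})^2 = E\big\{ n_{i2}^2 E_{n_{i2}}(\bar X_{i2} - \mu_{i2})^2\big\} = E\Big\{ n_{i2}^2\,\frac{\sigma_{i2}^2}{n_{i2}}\Big\},$$

where $E_{n_{i2}}$ denotes the expectation when $n_{i2}$ is known. Hence

$$E n_{i2}^2(\bar X_{(i+1)\cdot} - \bar X_{i2})^2 = \Big(\frac{1}{k_{i+1}} - 1\Big)\sigma_{i2}^2 E n_{i2} = n_1\Big(\prod_{2}^{i} k_s\Big)\Big(\frac{1}{k_{i+1}} - 1\Big)P_{i2}\sigma_{i2}^2 .$$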
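Therefore,

$$\operatorname{var}\bar X = \frac{1}{n_1}\Big[\Big(1 - \frac{n_1}{N}\Big)\sigma^2 + \Big(\frac{1}{k_2} - 1\Big)P_{12}\sigma_{12}^2 + \frac{1}{k_2}\Big(\frac{1}{k_3} - 1\Big)P_{22}\sigma_{22}^2 + \frac{1}{k_2 k_3}\Big(\frac{1}{k_4} - 1\Big)P_{32}\sigma_{32}^2 + \cdots + \frac{1}{\prod_{2}^{m} k_s}\Big(\frac{1}{w} - 1\Big)P_{m2}\sigma_{m2}^2\Big] \qquad (4)$$

where $\sigma^2$ is the population variance.
Denote by $R$ the relative variance of $\bar X$ with respect to the population
variance and put $s_j = P_{j2}\sigma_{j2}^2/\sigma^2$; the above formula (4) then becomes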


$$R + \frac{1}{N} = \frac{1}{n_1}\Big[ 1 - s_1 + \sum_{j=1}^{m-1}\frac{s_j - s_{j+1}}{\prod_{2}^{j+1} k_s} + \frac{s_m}{w\prod_{2}^{m} k_s}\Big]. \qquad (5)$$
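
As a rough numerical aid, the right-hand side of (5) can be evaluated as in the sketch below. The function name and the list-based argument layout are assumptions made for illustration only; `s` holds $s_1, \ldots, s_m$ and `k` holds $k_2, \ldots, k_m$.

```python
from math import prod


def relative_variance_plus_inv_N(n1, s, k, w):
    """Illustrative evaluation of the right-hand side of (5), i.e. R + 1/N.

    s -- [s_1, ..., s_m], with s_j = P_j2 * sigma_j2^2 / sigma^2
    k -- [k_2, ..., k_m], mail sampling fractions
    w -- interviewing sampling fraction
    """
    m = len(s)
    if len(k) != m - 1:
        raise ValueError("need m - 1 mail sampling fractions")
    bracket = 1.0 - s[0]
    for j in range(1, m):  # j = 1, ..., m-1
        # k[:j] holds k_2, ..., k_(j+1), the divisor of the j-th term
        bracket += (s[j - 1] - s[j]) / prod(k[:j])
    bracket += s[m - 1] / (w * prod(k))
    return bracket / n1
```

A quick sanity check on the formula: with complete coverage at every stage (all $k$'s and $w$ equal to one) the bracket telescopes to 1, so the function returns $1/n_1$, the value for a simple random sample with full response.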
ATTAINING MAXIMUM PRECISION FOR A CERTAIN COST 


We proceed now to calculate the sampling fractions $k_2, \ldots, k_m, w$ which
lead to minimum relative variance $R$ for a given cost.

Denote by $C_0$ the cost of sending out a questionnaire, by $C_1$ the cost of
processing per questionnaire and by $C_2$ the total cost (interviewing and processing)
per case of the final group. The cost of the whole investigation which involves
any number m of mail attempts and a final interviewing of non-respondents 
will be: 


$$\text{cost} = C_0\big(n_1 + k_2 n_{12} + k_3 n_{22} + \cdots + k_m n_{(m-1)2}\big) + C_1\big(n_{11} + n_{21} + n_{31} + \cdots + n_{m1}\big) + C_2 w n_{m2}. \qquad (6)$$


The expected cost will be given by 


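$$C = n_1 C_0\Big\{1 + k_2 P_{12} + k_2 k_3 P_{22} + \cdots + \Big(\prod_{2}^{m} k_s\Big)P_{(m-1)2}\Big\} + n_1 C_1\Big\{P_{11} + k_2 P_{21} + k_2 k_3 P_{31} + \cdots + \Big(\prod_{2}^{m} k_s\Big)P_{m1}\Big\} + n_1 C_2\Big(\prod_{2}^{m} k_s\Big) w P_{m2}$$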


or 
$$C = n_1\Big\{(C_0 + P_{11}C_1) + \sum_{j=1}^{m-1}\Big(\prod_{2}^{j+1} k_s\Big)\big(C_0 P_{j2} + C_1 P_{(j+1)1}\big) + w\Big(\prod_{2}^{m} k_s\Big)C_2 P_{m2}\Big\}. \qquad (7)$$
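
The expected cost (7) lends itself to the same kind of sketch. Again the function name and argument layout are hypothetical: `P1` holds $P_{11}, \ldots, P_{m1}$, `P2` holds $P_{12}, \ldots, P_{m2}$, and `k` holds $k_2, \ldots, k_m$.

```python
from math import prod


def expected_cost(n1, P1, P2, k, w, C0, C1, C2):
    """Illustrative evaluation of the expected cost (7).

    P1 -- [P_11, ..., P_m1], proportions responding at each mail attempt
    P2 -- [P_12, ..., P_m2], proportions still not responding after each attempt
    k  -- [k_2, ..., k_m], mail sampling fractions
    w  -- interviewing sampling fraction
    C0, C1, C2 -- mailing, processing and interviewing unit costs
    """
    m = len(P1)
    if len(P2) != m or len(k) != m - 1:
        raise ValueError("inconsistent number of attempts")
    per_unit = C0 + P1[0] * C1                      # first mail attempt
    for j in range(1, m):                            # call-back number j + 1
        per_unit += prod(k[:j]) * (C0 * P2[j - 1] + C1 * P1[j])
    per_unit += w * prod(k) * C2 * P2[m - 1]         # final interviewing stage
    return n1 * per_unit
```

For instance, with made-up values $n_1 = 100$, $P_{11} = .6$, $P_{12} = .4$, $w = .5$, $C_0 = 1$, $C_1 = 2$ and $C_2 = 10$ and a single mail attempt, the three components are 100, 120 and 200 cost units.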


Given a certain expected cost C for the whole investigation, we want to find 
the initial sample size $n_1$ and the sampling fractions $k_2, \ldots, k_m, w$ which
minimize the relative variance $R$ of the estimate $\bar X$.

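Eliminating $n_1$ between (5) and (7) we get by differentiation

$$\Big(\prod_{2}^{j+1} k_s\Big)^2 = \frac{M}{L}\cdot\frac{s_j - s_{j+1}}{C_0 P_{j2} + C_1 P_{(j+1)1}}, \qquad j = 1, 2, \ldots, m-1,$$

$$\Big(w\prod_{2}^{m} k_s\Big)^2 = \frac{M}{L}\cdot\frac{s_m}{C_2 P_{m2}}, \qquad (8)$$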
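where

$$M = (C_0 + P_{11}C_1) + \sum_{j=1}^{m-1}\Big(\prod_{2}^{j+1} k_s\Big)\big(C_0 P_{j2} + C_1 P_{(j+1)1}\big) + w\Big(\prod_{2}^{m} k_s\Big)C_2 P_{m2},$$

$$L = (1 - s_1) + \sum_{j=1}^{m-1}\frac{s_j - s_{j+1}}{\prod_{2}^{j+1} k_s} + \frac{s_m}{w\prod_{2}^{m} k_s}. \qquad (9)$$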





The constant $M/L$ is simply equal to

$$\frac{C_0 + P_{11}C_1}{1 - s_1}. \qquad (10)$$


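This can easily be shown by putting (8) in the form

$$\Big(\prod_{2}^{j+1} k_s\Big)\big(C_0 P_{j2} + C_1 P_{(j+1)1}\big) = \frac{M}{L}\cdot\frac{s_j - s_{j+1}}{\prod_{2}^{j+1} k_s}$$

and

$$w\Big(\prod_{2}^{m} k_s\Big)C_2 P_{m2} = \frac{M}{L}\cdot\frac{s_m}{w\prod_{2}^{m} k_s}$$

and adding to get the relation

$$\sum_{j=1}^{m-1}\Big(\prod_{2}^{j+1} k_s\Big)\big(C_0 P_{j2} + C_1 P_{(j+1)1}\big) + w\Big(\prod_{2}^{m} k_s\Big)C_2 P_{m2} = \frac{M}{L}\Big\{\sum_{j=1}^{m-1}\frac{s_j - s_{j+1}}{\prod_{2}^{j+1} k_s} + \frac{s_m}{w\prod_{2}^{m} k_s}\Big\} \qquad (11)$$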
which, when compared with $M/L$ as given by (9), reduces to

$$\frac{M}{L} = \frac{C_0 + P_{11}C_1}{1 - s_1}.$$

Substituting for $M/L$ in (8) we finally get for the products of the sampling
fractions:

$$\Big(\prod_{2}^{j+1} k_s\Big)^2 = \frac{s_j - s_{j+1}}{1 - s_1}\cdot\frac{C_0 + P_{11}C_1}{C_0 P_{j2} + C_1 P_{(j+1)1}}$$

and

$$\Big(w\prod_{2}^{m} k_s\Big)^2 = \frac{s_m}{1 - s_1}\cdot\frac{C_0 + P_{11}C_1}{C_2 P_{m2}} \qquad (12)$$


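and consequently the sampling fractions when $m > 1$ will be:

$$k_2^2 = \frac{s_1 - s_2}{1 - s_1}\cdot\frac{C_0 + P_{11}C_1}{C_0 P_{12} + C_1 P_{21}},$$

$$k_u^2 = \frac{s_{u-1} - s_u}{s_{u-2} - s_{u-1}}\cdot\frac{C_0 P_{(u-2)2} + C_1 P_{(u-1)1}}{C_0 P_{(u-1)2} + C_1 P_{u1}}, \qquad u = 3, \ldots, m, \qquad (13)$$

and

$$w^2 = \frac{s_m}{s_{m-1} - s_m}\cdot\frac{C_0 P_{(m-1)2} + C_1 P_{m1}}{C_2 P_{m2}}.$$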


When $m = 1$ (i.e. when there is only one mail attempt followed by the
interviewing stage), equations (5) and (7) will give for the interviewing sampling
fraction:

$$w^2 = \frac{s_1}{1 - s_1}\cdot\frac{C_0 + P_{11}C_1}{C_2 P_{12}}, \qquad (13')$$

which is obviously equal to the $w$ given by (13) after putting $P_{02} = 1$ and $s_0 = 1$.
The value of $w$ given by (13') is naturally identical with the interviewing
sampling fraction given by Hansen and Hurwitz in [4].
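
The optimum fractions can be computed from the cumulative products in (12) and then separated into the individual fractions of (13), as in the following sketch. The function name and argument layout are assumptions for illustration; the inputs would in practice be rough planning estimates. The function does not check that the results are at most one, a requirement that the paper treats separately.

```python
from math import sqrt


def optimum_fractions(P1, P2, s, C0, C1, C2):
    """Illustrative computation of ([k_2, ..., k_m], w) from (12) and (13).

    P1 -- [P_11, ..., P_m1];  P2 -- [P_12, ..., P_m2];  s -- [s_1, ..., s_m]
    """
    m = len(s)
    base = (C0 + P1[0] * C1) / (1.0 - s[0])          # the constant M/L
    # cum[j] is the product k_2 ... k_(j+1) from (12); cum[0] = 1 (empty product)
    cum = [1.0]
    for j in range(1, m):
        gain = s[j - 1] - s[j]
        unit_cost = C0 * P2[j - 1] + C1 * P1[j]
        cum.append(sqrt(base * gain / unit_cost))
    # successive ratios of the cumulative products give the individual k's
    k = [cum[j] / cum[j - 1] for j in range(1, m)]
    w = sqrt(base * s[m - 1] / (C2 * P2[m - 1])) / cum[-1]
    return k, w


# One mail attempt followed by interviews, with made-up planning values:
k, w = optimum_fractions([0.6], [0.4], [0.4], C0=1.0, C1=2.0, C2=10.0)
print(k, round(w, 3))
```

With a single mail attempt the list of mail fractions is empty and `w` reduces to the value given by (13').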

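The optimum sample size will be obtained by carrying out the appropriate
substitutions in (5). We then get, for any number of mail attempts:

$$n_1 = C\Big(\frac{1 - s_1}{C_0 + P_{11}C_1}\Big)^{1/2}\Big[\{(1 - s_1)(C_0 + P_{11}C_1)\}^{1/2} + \sum_{j=1}^{m-1}\{(C_0 P_{j2} + C_1 P_{(j+1)1})(s_j - s_{j+1})\}^{1/2} + \{C_2 P_{m2} s_m\}^{1/2}\Big]^{-1} \qquad (14)$$

and the relative variance of the estimate thus obtained is given by

$$R + \frac{1}{N} = \frac{1}{C}\Big[\{(1 - s_1)(C_0 + P_{11}C_1)\}^{1/2} + \sum_{j=1}^{m-1}\{(C_0 P_{j2} + C_1 P_{(j+1)1})(s_j - s_{j+1})\}^{1/2} + \{C_2 P_{m2} s_m\}^{1/2}\Big]^2. \qquad (15)$$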
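
Formulas (14) and (15) share the same bracketed sum of square roots, which suggests the compact sketch below. The function name and argument layout are hypothetical, and the numbers one would feed in are planning guesses rather than anything taken from the paper.

```python
from math import sqrt


def plan_for_cost(C, P1, P2, s, C0, C1, C2):
    """Illustrative evaluation of (14) and (15): returns (n_1, R + 1/N).

    C  -- total expected cost allowed for the survey
    P1 -- [P_11, ..., P_m1];  P2 -- [P_12, ..., P_m2];  s -- [s_1, ..., s_m]
    """
    m = len(s)
    # variance-side coefficients: 1 - s_1, s_j - s_(j+1), s_m
    A = [1.0 - s[0]] + [s[j - 1] - s[j] for j in range(1, m)] + [s[m - 1]]
    # cost-side coefficients: C0 + P11*C1, C0*P_j2 + C1*P_(j+1)1, C2*P_m2
    B = ([C0 + P1[0] * C1]
         + [C0 * P2[j - 1] + C1 * P1[j] for j in range(1, m)]
         + [C2 * P2[m - 1]])
    S = sum(sqrt(a * b) for a, b in zip(A, B))       # the bracket in (14), (15)
    n1 = C * sqrt(A[0] / B[0]) / S
    return n1, S * S / C
```

Setting the problem up through the two coefficient lists makes it easy to see why the same sum appears in both formulas: each stage contributes the square root of its variance coefficient times its cost coefficient.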
Instead of specifying the cost and setting forth to calculate the initial sample
size and the sampling fractions which give minimum variance of the estimate
we can start by adopting a certain relative variance $R$ of the estimate and then
calculate $n_1$, $w$, and the $k$'s which will minimize the cost of the investigation.
In that case we can easily verify that the sampling fractions will be the same
as those given above, that the initial sample size is

$$n_1 = \Big(R + \frac{1}{N}\Big)^{-1}\Big(\frac{1 - s_1}{C_0 + P_{11}C_1}\Big)^{1/2}\Big[\{(1 - s_1)(C_0 + P_{11}C_1)\}^{1/2} + \sum_{j=1}^{m-1}\{(C_0 P_{j2} + C_1 P_{(j+1)1})(s_j - s_{j+1})\}^{1/2} + \{C_2 P_{m2} s_m\}^{1/2}\Big] \qquad (16)$$

and that the expected cost of the investigation will then be

$$C = \Big(R + \frac{1}{N}\Big)^{-1}\Big[\{(1 - s_1)(C_0 + P_{11}C_1)\}^{1/2} + \sum_{j=1}^{m-1}\{(C_0 P_{j2} + C_1 P_{(j+1)1})(s_j - s_{j+1})\}^{1/2} + \{C_2 P_{m2} s_m\}^{1/2}\Big]^2. \qquad (17)$$


The fraction $1/N$ which is always associated with the relative variance $R$
can be omitted from the formulas (15)-(17) provided it is small enough
compared to $R$.

It is interesting to notice from (14) and (15) or from (16) and (17) that the
variance of the estimate resulting from the above procedure, the initial sample
size required for obtaining this variance, and the total cost of the survey are
related by the simple formula

$$\frac{n_1^2 R}{C} = \frac{1 - s_1}{C_0 + P_{11}C_1} \qquad (18)$$

which depends only upon the probability of response in the first attempt and
the relative variance of the non-respondents in that attempt. This formula is
equivalent to

$$\frac{n_1 R}{1 - s_1} = \frac{\text{Total cost of the survey}}{\text{Cost of the first attempt}}. \qquad (18')$$


CONDITIONS FOR THE APPLICABILITY OF THE ABOVE PROCEDURE 


The optimum sampling fractions have been obtained, mathematically
speaking, by applying a set of weights to the returns of the m+1 attempts and 
then minimizing the variance of the estimate for fixed cost or minimizing the 
cost for fixed variance. However, the solution does not guarantee a fundamental 
requirement of these fractions, namely that each should be less than or equal 
to unity. The fact that this will not always be satisfied in general terms can 
be realized from the dependence of the sampling fractions on the costs $C_0$, $C_1$
and $C_2$ as well as on the construction of the population.
Going back to equations (13) we find that the sampling fractions in the mail
attempts will be less than or equal to unity if

$$\frac{1 - s_1}{C_0 + P_{11}C_1} \ge \frac{s_{j-1} - s_j}{C_0 P_{(j-1)2} + C_1 P_{j1}} \ge \frac{s_j - s_{j+1}}{C_0 P_{j2} + C_1 P_{(j+1)1}}, \qquad j = 2, 3, \ldots, m-1, \qquad (19a)$$

and that the interviewing sampling fraction $w$ will satisfy the same requirement
if

$$\frac{s_{m-1} - s_m}{C_0 P_{(m-1)2} + C_1 P_{m1}} \ge \frac{s_m}{C_2 P_{m2}} \qquad (19b)$$

which can be put in the form

$$\frac{\sigma_{(m-1)2}^2}{\sigma_{m2}^2} \ge \frac{C_0 + C_1}{C_2} + \Big(1 - \frac{C_1}{C_2}\Big)\frac{P_{m2}}{P_{(m-1)2}}. \qquad (19b')$$

The satisfaction of (19) is of course a necessary condition for the
applicability of a sampling procedure that employs a sample of the non-respondents
in each call-back.
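
A planner could screen a set of assumed parameters against these conditions before using the optimum fractions. The sketch below, with a hypothetical function name and argument layout, checks that the chain of ratios behind the conditions for the mail fractions and for $w$ is non-increasing, which is what keeps every fraction from (13) at or below unity.

```python
def fractions_at_most_one(P1, P2, s, C0, C1, C2):
    """Illustrative check that every optimum fraction is <= 1.

    P1 -- [P_11, ..., P_m1];  P2 -- [P_12, ..., P_m2];  s -- [s_1, ..., s_m]
    """
    m = len(s)
    # ratio for the first attempt, then one ratio per mail call-back
    ratios = [(1.0 - s[0]) / (C0 + P1[0] * C1)]
    for j in range(1, m):
        ratios.append((s[j - 1] - s[j]) / (C0 * P2[j - 1] + C1 * P1[j]))
    # ratio for the final interviewing stage
    ratios.append(s[m - 1] / (C2 * P2[m - 1]))
    # each sampling fraction is <= 1 exactly when its ratio does not exceed
    # the ratio of the stage before it
    return all(prev >= nxt for prev, nxt in zip(ratios, ratios[1:]))
```

When the check fails, the corresponding stage would call for a fraction above one, so a complete canvass of the non-respondents at that stage is the most that can be done.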


SOME REMARKS ABOUT THE SAMPLING FRACTIONS 


We are going to discuss here some of the properties of the sampling fractions 
given by (13) and the impact of these properties on the design of the survey. 
We will also discuss what kind of information is required before starting an 
attempt in order to estimate the sampling fraction which ought to be used in 
that attempt. 

One will notice first that all the sampling fractions are independent of the
size of the population and that all except $k_2$ (and $w$ in the case of one mail
attempt) are independent of the value of the population variance $\sigma^2$. Even
$k_2$ and $w$ in the case of one mail attempt do not involve $\sigma^2$ itself but
rather a relative variance with respect to $\sigma^2$.

We will also notice that these fractions do not involve the initial sample size
$n_1$ or the prescribed precision or cost. This is a direct result of the fact that
these fractions were obtained by a procedure which is equivalent to minimizing the
product cost × variance regardless of $n_1$. This important property means that
if the information required for calculating $n_1$ from (14) or (16) is not available
at the planning stage of the survey and if any value of $n_1$ is therefore guessed
and applied, then the procedure explained above which employs the fractions
(13) will still be optimum in the sense that it will minimize the product cost ×
variance. That is to say, the resulting estimate will have the least variance
among all estimates that can be obtained for the expended cost, or the least
cost among all estimates that have the resulting variance.

Another important property of the mail sampling fractions $(k_2, \ldots, k_m)$
is that they do not depend upon the number of mail attempts in the survey.
This means that the number of mail attempts does not have to be fixed in
advance as far as the sampling fractions in the mail attempts are concerned.
The investigator can continue the mail attempts as long as he has reason to 
believe that this continuation is profitable in terms of cost or precision until 
he finally starts the interviewing stage. The interviewing sampling fraction will 
naturally be affected by the number of the preceding mail attempts. The 
question of when it is most profitable to start the interviewing stage will be 
discussed in the next section. 

The value of the sampling fraction $k_u$ which will be used in the $u$th attempt
depends upon:

a) $C_0$, $C_1$, $C_2$, which are known,

b) $P_{(u-2)2}$, $P_{(u-1)2}$, $P_{(u-1)1}$, which can be estimated unbiasedly from the
returns of the preceding two attempts,

c) $P_{u1}$ and $P_{u2}$, namely the expected proportions of the responding and the
non-responding groups in the attempt which we are planning. Only the
total of these two proportions, $P_{u1} + P_{u2} = P_{(u-1)2}$, can be estimated from
the results of the preceding attempt. $P_{u1}$ will have to be estimated either
from past experience or by considering the proportions of response in the
preceding attempts.

d) The relative variances of the non-responding groups in the $(u-1)$th and
the $u$th attempts with respect to the variance of the non-responding
group in the $(u-2)$th attempt. Again we have to assume that these
relative variances can be estimated from experience or by studying the
results of the preceding attempts.

The interviewing sampling fraction $w$ is less dependent upon previous
knowledge. It will be fully determined when $\sigma_{m2}^2/\sigma_{(m-1)2}^2$ is known.


DETERMINATION OF THE OPTIMUM NUMBER OF ATTEMPTS 


There is a certain attempt in the mailing process beyond which it is dis- 
advantageous to continue contacting individuals by mail. This attempt will be 
determined by the fact that when the total cost of the investigation is fixed 
beforehand, the estimate $\bar X$ should have the least possible variance for the
cost; or, if the precision of the estimate is fixed in advance, the optimum number 
of attempts will be that which minimizes the cost of obtaining an estimate 
having the required precision. 

Equations (15) and (17) will easily show that we can increase the precision
for fixed cost or reduce the cost of attaining a fixed precision by undertaking the
$(m+1)$th mail attempt after the termination of the $m$th attempt $(m = 1, 2, \ldots)$
as long as the following inequality is satisfied:


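$$\big(C_0 P_{m2} + C_1 P_{(m+1)1}\big)^{1/2}(s_m - s_{m+1})^{1/2} + \big(C_2 P_{(m+1)2}\, s_{m+1}\big)^{1/2} < \big(C_2 P_{m2}\, s_m\big)^{1/2} \qquad (20)$$

i.e., as long as

$$\Big\{\Big(1 - \frac{P_{(m+1)2}}{P_{m2}}\cdot\frac{\sigma_{(m+1)2}^2}{\sigma_{m2}^2}\Big)\Big(\frac{C_0}{C_2} + \frac{C_1}{C_2}\cdot\frac{P_{(m+1)1}}{P_{m2}}\Big)\Big\}^{1/2} + \frac{P_{(m+1)2}}{P_{m2}}\cdot\frac{\sigma_{(m+1)2}}{\sigma_{m2}} < 1. \qquad (21)$$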
In the cases where $\sigma_{m2} < \sigma_{(m+1)2}$, (21) will be satisfied whenever

$$\frac{\sigma_{(m+1)2}}{\sigma_{m2}} \le \frac{P_{m2}}{P_{(m+1)2}}\Big(1 - \frac{C_0 + C_1 P_{(m+1)1}/P_{m2}}{C_2}\Big). \qquad (22)$$


This simple inequality can therefore be used as a substitute for (21) when the 
variance of the non-responding group is increasing from one attempt to the 
next. 

After the mth mail attempt is terminated, inequality (21) is fully determined, 
and can be used to test whether an (m+1)th mail attempt is justified, if we 
can estimate the probability of response in the following attempt (which equals 
$P_{(m+1)1}/P_{m2}$) and the ratio of the variances of the non-responding groups in the
(m+1)th and the mth attempts. 

Inequality (21) thus furnishes, in the situations where the investigator can 
predict the proportion and the relative variance of the non-responding group 
one and only one stage ahead, a step-by-step test as to whether it would be 
more profitable (in terms of cost or precision) to carry out the next step by 
mail. 
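
The step-by-step test can be written directly in terms of the quantities $s_j = P_{j2}\sigma_{j2}^2/\sigma^2$, using the form of inequality (20). The sketch below is only an illustration; the function name and parameter names are hypothetical, and the inputs for the $(m+1)$th attempt are the investigator's predictions.

```python
from math import sqrt


def another_mail_wave_pays(P_m2, P_next1, P_next2, s_m, s_next, C0, C1, C2):
    """Illustrative test of inequality (20) for one further mail attempt.

    P_m2    -- proportion not responding after the m-th attempt
    P_next1 -- predicted proportion responding to the (m+1)-th attempt
    P_next2 -- predicted proportion still not responding after it
    s_m, s_next -- s_m and s_(m+1); s_m >= s_next is assumed
    """
    mail_term = sqrt((C0 * P_m2 + C1 * P_next1) * (s_m - s_next))
    later_interviews = sqrt(C2 * P_next2 * s_next)
    interviews_now = sqrt(C2 * P_m2 * s_m)
    return mail_term + later_interviews < interviews_now


# Made-up planning values: the same prospects with costly and with cheap interviews.
print(another_mail_wave_pays(0.4, 0.2, 0.2, 0.4, 0.2, C0=1.0, C1=2.0, C2=10.0))  # True
print(another_mail_wave_pays(0.4, 0.2, 0.2, 0.4, 0.2, C0=1.0, C1=2.0, C2=1.0))   # False
```

The two calls differ only in the interviewing cost: when interviews are expensive relative to mailing, a further mail wave pays; when they are about as cheap as mailing, it does not.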

While failure to satisfy (21) is necessary for determining the optimum number
of mail attempts, this condition is not sufficient unless no information about
the non-responding groups beyond the $(m+1)$th is available or predictable. If
the proportions and the relative variances of the $(m+1)$th, $(m+2)$th, ...,
$(m+z)$th non-responding groups were available before starting the $(m+1)$th
attempt, then the investigator may find that, although it is more profitable to
carry out the interviewing step directly after the $m$th attempt than after the
$(m+1)$th, it is still more profitable to continue the mail attempts to a certain
stage beyond the $(m+1)$th. In fact, we can easily see from (15) or (17) that one
is justified in proceeding with the mail call-backs up to the $y$th attempt
$(m < y \le m+z)$ as long as


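$$1 - \frac{P_{y2}\,\sigma_{y2}}{P_{m2}\,\sigma_{m2}} > \sum_{j=m}^{y-1}\left\{\left(\frac{P_{j2}\,\sigma_{j2}^{2} - P_{(j+1)2}\,\sigma_{(j+1)2}^{2}}{P_{m2}\,\sigma_{m2}^{2}}\right)\left(\frac{C_0}{C_2}\,\frac{P_{j2}}{P_{m2}} + \frac{C_1}{C_2}\,\frac{P_{(j+1)1}}{P_{m2}}\right)\right\}^{1/2}. \tag{23}$$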


If we find a value of y which satisfies this inequality then it will be more 
profitable to continue the mail attempts to y than to terminate the mail call- 
backs after the mth attempt. 

The utilization of the inequalities (21) and (23) will be illustrated in the 
following examples. 

Example 1. The following example is based on data obtained from Table I
of Deming's paper [3]. The response rates $P_i$ in this table were intended to
simulate average urban interviewing experience. The mean values $a_i$ of the
characteristic under investigation have a tenfold variation from class 1 (individuals
who respond one time out of 8 on the average) to class 8 (individuals
who respond 8 times out of 8). The standard deviation within each of the classes
was assumed to be equal to the mean value in that class. The table is as follows:


Class                      8      6      4      2      1      0


Proportion P_i            .30    .25    .20    .10    .10    .05
Mean value a_i           1.00    .6     .4     .2     .1
Standard deviation σ_i   1.00    .6     .4     .2     .1


In order to apply these data in our study we are going to assume that the 
above classes 1-8 represent five consecutive responding groups in five consecu- 
tive mail attempts, class 8 being the group that responds to the first mail 
attempt, class 6 the group that responds to the second mail attempt and so on. 
We also have to assume a mean value and a standard deviation for class 0 in
order to calculate the means and variances of the non-responding groups. We
are going to assume that $a_0 = \sigma_0 = .05$.

From the above table we calculate the variances of the non-responding groups 
in each attempt and then construct the following table which furnishes the 
data required for determining how far we should continue the mail attempts 
and for calculating the optimum sampling fractions, the initial sample size, 
and the cost (or the relative variance). 


Attempt i                     1       2       3       4       5


P_i1                         .30     .25     .20     .10     .10
P_i2                         .70     .45     .25     .15     .05
σ_i2                         .47     .32     .16     .09     .05
σ² s_i = P_i2 σ²_i2         .1559   .0462   .0061   .0012   .0001


The variance σ² of the whole population is also found to be equal to .5379.
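
The construction of this table can be retraced with a short sketch. It assumes, as stated above, that the classes respond in the order 8, 6, 4, 2, 1, that class 0 never responds by mail, and that the non-responding group after attempt i is the pooled mixture of all later classes; the helper name `pooled` is hypothetical.

```python
from math import sqrt

# Classes in order of response: 8, 6, 4, 2, 1, and finally class 0.
proportions = [0.30, 0.25, 0.20, 0.10, 0.10, 0.05]
means = [1.00, 0.60, 0.40, 0.20, 0.10, 0.05]
sds = list(means)  # the standard deviation equals the mean in each class


def pooled(props, mus, sigmas):
    """Weight, mean and variance of a mixture of classes."""
    weight = sum(props)
    mean = sum(p * m for p, m in zip(props, mus)) / weight
    second = sum(p * (s * s + m * m) for p, m, s in zip(props, mus, sigmas)) / weight
    return weight, mean, second - mean * mean


_, _, population_variance = pooled(proportions, means, sds)

rows = []
for i in range(1, 6):
    # Non-respondents after attempt i are the classes that have not yet answered.
    p_non, _, var_non = pooled(proportions[i:], means[i:], sds[i:])
    rows.append((i, proportions[i - 1], p_non, sqrt(var_non), p_non * var_non))

print(f"population variance = {population_variance:.4f}")
for i, p_resp, p_non, sd_non, s_abs in rows:
    print(f"attempt {i}: P_i1={p_resp:.2f} P_i2={p_non:.2f} "
          f"sigma_i2={sd_non:.2f} sigma^2 s_i={s_abs:.4f}")
```

Small differences in the last decimal place from the printed table can arise from rounding at intermediate steps.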

Suppose now that C₀ = $0.10, C₁ = $0.40 and C₂ = $4.50. These costs are
quoted from the Hansen and Hurwitz paper [4]. Note that any proportional
increase in these costs since 1946 will not affect the sampling fractions or the
decision as to when to stop the mail attempts.

Let us examine whether the investigator is justified in carrying out a second
mail attempt after the first. Inequality (21) says that a second mail attempt is
advantageous if


$$\left\{1 - \frac{.45}{.70}\left(\frac{.32}{.47}\right)\right\}^{2} > \left\{1 - \frac{.0462}{.1559}\right\}\left(\frac{.10}{4.5} + \frac{.40}{4.5}\times\frac{.25}{.70}\right)$$

which is true because the left-hand side equals .3162 while the right-hand side
reduces to .0380 only.

Similarly, by consecutive application of (21) we easily find that this inequality 
is satisfied for each of the cases m = 1, 2, 3, 4. The fact that the inequality holds 
in each case means that it is to the advantage of the investigator to continue the 
mail attempts even to the fifth attempt. The advantage is that he will acquire 
more precision for the cost or smaller cost of a certain precision. 

The sampling fractions employed in the consecutive attempts are calculated 
by substituting in equations (13). Those equations will give us: 
$$k_2^2 = .3716,\quad k_3^2 = .4971,\quad k_4^2 = .2350,\quad k_5^2 = .2653.$$


The available data do not enable us to test whether or not the sixth attempt 
should be carried out by mail. If we take it for granted that this attempt will 
constitute the final interviewing stage, the last formula in (13) will give us for 
the square of the interviewing sampling fraction: $w^2 = .0222$.

The optimum planning of the survey will therefore necessitate four mail call- 
backs. In the first call-back (the second attempt) we have to contact 61% of 
the non-respondents to the initial attempt. 

In the following attempts we must call back on 71%, 48%, and 52% of the 
non-respondents consecutively. Fifteen per cent of the non-respondents to the 
final mail attempt are then chosen at random and interviewed personally. 
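
Equations (13) are not reproduced in this section, so the following is only a sketch under an assumed form: each squared sampling fraction is taken as the ratio of (variance component divided by unit cost) at one stage to the same quantity at the preceding stage. This square-root allocation is an assumption, but with the rounded table values it gives figures that agree with those quoted above. All variable names are hypothetical.

```python
from math import sqrt

C0, C1, C2 = 0.10, 0.40, 4.50
p_resp = [0.30, 0.25, 0.20, 0.10, 0.10]            # P_i1
p_non = [0.70, 0.45, 0.25, 0.15, 0.05]             # P_i2
s_abs = [0.1559, 0.0462, 0.0061, 0.0012, 0.0001]   # sigma^2 * s_i (rounded table values)
sigma2 = 0.5379

# Variance removed and cost incurred (per unit of the initial sample, at full
# coverage) by the first mailing, each mail call-back, and the final interviews.
var_steps = [sigma2 - s_abs[0]]
cost_steps = [C0 + C1 * p_resp[0]]
for j in range(1, 5):
    var_steps.append(s_abs[j - 1] - s_abs[j])
    cost_steps.append(C0 * p_non[j - 1] + C1 * p_resp[j])
var_steps.append(s_abs[4])
cost_steps.append(C2 * p_non[4])

ratios = [v / c for v, c in zip(var_steps, cost_steps)]
# Assumed form: squared fraction at a stage = ratio at that stage / ratio at the stage before.
fractions_sq = [ratios[j] / ratios[j - 1] for j in range(1, 6)]

labels = ["k2", "k3", "k4", "k5", "w"]
for label, f2 in zip(labels, fractions_sq):
    print(f"{label}^2 = {f2:.4f}   {label} = {100 * sqrt(f2):.0f} per cent")
```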

If the investigator aims at an estimate which has a certain relative variance
R with respect to the population variance, the initial sample size will then be,
after carrying out the appropriate substitutions in (16),

$$n_1 = 1.296\,R^{-1}$$

and the expected total cost will be obtained from (17) or (18). Either of the two
formulas gives

$$C = .52025\,R^{-1}.$$

On the other hand, if the investigator has a specific fund C for the operation
then, from (14),

$$n_1 = 2.4909\,C$$

and the relative variance of the resulting estimate will be, from (15) or (18),

$$R = .52025\,C^{-1}.$$


The following table is constructed in order to compare the results
of this optimum procedure with those of the shorter procedures where the mail call-backs
are stopped and the final interviewing stage is started directly after
fewer than five mail attempts have been completed. We will assume that a
relative variance of the estimated mean equal to one per thousand is required.
The tabulated results are obtained from formulas (13), (16), and (17).


No. of mail   Initial                                                  Expected
attempts      sample    Sampling fractions (per cent)                  cost


    1          2428     w=17                                           $1826
    2          1796     k_2=61, w=19                                   $ 999
    3          1423     k_2=61, k_3=71, w=13                           $ 628
    4          1333     k_2=61, k_3=71, k_4=48, w=15                   $ 552
    5          1296     k_2=61, k_3=71, k_4=48, k_5=52, w=15           $ 520


It is interesting to compare the above results with those of the ordinary 
procedure where all the non-respondents are called on in consecutive attempts 
until their answers are obtained. In the present case this procedure will involve 
an initial mail attempt, four call-backs each of which is a complete coverage of 
the non-respondents in the preceding attempt, and a final interviewing stage. 
Now to get the same relative variance of one per thousand as in the procedures
given in the above table, we have to take an initial sample $n_1 = 1000$ (because
var $\bar{X} = \sigma^2/n_1$). The expected cost of the survey will then be


$$C = n_1\left\{(C_0 + P_{11}C_1) + \sum_{j=1}^{4}\left(C_0 P_{j2} + C_1 P_{(j+1)1}\right) + C_2 P_{52}\right\} = \$860.$$

A comparison between this figure and the last column illustrates very clearly
the waste involved in complete coverage of non-respondents. The last row of
the table shows that we can obtain the same variance as that resulting from
complete coverage in the six attempts for about five-eighths only of the cost
just by sampling the non-respondents in the proportions given in that row
rather than taking them all into consideration.

The third and fourth rows show that this complete coverage in the six attempts
is inferior even to 3 mail attempts followed by the interviewing stage
when the optimum sampling fractions are used.

The first two rows of the table illustrate also the importance of continuing 
the mail attempts to at least a third attempt. A procedure involving one mail 
attempt will require more than twice the cost of complete coverage of non- 
respondents in six consecutive attempts. A procedure involving two mail 
attempts will require a 16% increase over the latter cost. 
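
The cost comparisons of this example can be retraced approximately with the sketch below. Equations (15) to (17) are not reproduced in this section, so the sketch assumes that, under optimum sampling fractions, the product of expected cost and relative variance equals the squared sum of the square roots of (unit cost times variance component) over the stages, divided by σ². With the rounded table inputs this assumed form comes within about one per cent of the tabulated costs. Function and variable names are hypothetical.

```python
from math import sqrt

C0, C1, C2 = 0.10, 0.40, 4.50
p_resp = [0.30, 0.25, 0.20, 0.10, 0.10]            # P_i1
p_non = [0.70, 0.45, 0.25, 0.15, 0.05]             # P_i2
s_abs = [0.1559, 0.0462, 0.0061, 0.0012, 0.0001]   # sigma^2 * s_i (rounded table values)
sigma2 = 0.5379
R = 0.001                                          # required relative variance


def optimum_cost(m, rel_var):
    """Expected cost when m mail attempts are followed by sampled interviews (assumed form)."""
    total = sqrt((C0 + C1 * p_resp[0]) * (sigma2 - s_abs[0]))
    for j in range(1, m):
        total += sqrt((C0 * p_non[j - 1] + C1 * p_resp[j]) * (s_abs[j - 1] - s_abs[j]))
    total += sqrt(C2 * p_non[m - 1] * s_abs[m - 1])
    return total ** 2 / sigma2 / rel_var


costs = {m: optimum_cost(m, R) for m in range(1, 6)}

# Complete coverage: every non-respondent is followed up, so n1 = 1 / R.
n1 = 1.0 / R
complete_cost = n1 * ((C0 + C1 * p_resp[0])
                      + sum(C0 * p_non[j - 1] + C1 * p_resp[j] for j in range(1, 5))
                      + C2 * p_non[4])

for m, cost in costs.items():
    print(f"{m} mail attempt(s): ${cost:,.0f}  ({cost / complete_cost:.2f} x complete coverage)")
print(f"complete coverage: ${complete_cost:,.0f}")
```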

Example 2. The following example was constructed in order to illustrate the 
uses of inequalities (21) and (23). Consider an investigation where we know 
either before its commencement or from the returns of the consecutive attempts 
that 


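$$P_{12} = .7,\quad P_{22} = .4,\quad P_{32} = .3,$$

$$\frac{\sigma_{12}}{\sigma} = .8,\quad \frac{\sigma_{22}}{\sigma} = .7,\quad \frac{\sigma_{32}}{\sigma} = .6.$$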


Notice that these proportions of non-response in the consecutive attempts result
from proportions of response equal to .3, .3, .1 of the total population in
the three attempts. Suppose also that $C_0/C_2 = .3$ and $C_1/C_2 = .1$.

Now inequality (21) shows that we are justified in performing the second 
attempt by mail if 


$$\left(1 - \frac{4}{7}\times\frac{7}{8}\right)^{2} > \left(1 - \frac{4}{7}\times\frac{49}{64}\right)\left(.4 - .1\times\frac{4}{7}\right)$$


which is true. 
A third mail attempt, however, will not be justified under the circumstances
because

$$\left(1 - \frac{3}{4}\times\frac{6}{7}\right)^{2} < \left(1 - \frac{3}{4}\times\frac{36}{49}\right)\left(.4 - .1\times\frac{3}{4}\right).$$


Thus our knowledge of the above proportions and relative variances of the 
non-responding groups has enabled us to come to the conclusion that the most 
plausible procedure is to make two mail attempts and a final interviewing step. 
The sampling fractions, the initial sample size, and the cost (or the relative 
variance) are then calculated in the way given in the text and illustrated in
Example 1. 

The decision to terminate the mail call-backs after the second attempt could 
be changed, however, if more information were available. Suppose for instance 
that we knew before starting the third attempt that $P_{42} = .2$, $P_{52} = .1$, $\sigma_{42}/\sigma = .5$
and $\sigma_{52}/\sigma = .4$. We would then find that although it is more advantageous to
carry out the interviewing step after the second mail attempt than after the 
third, it is still more advantageous to proceed by mail to the fifth attempt. This 
follows by applying inequality (23) with m=2. This inequality is not satisfied 
for y=4, but when y=5 it becomes 


$$1 - \frac{.1 \times .4}{.4 \times .7} > .382 + .272 + .174, \quad\text{i.e.,}\quad .857 > .828.$$


The optimum number of mail attempts is therefore five rather than two. 
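
The multi-step test of this example can be sketched as follows. The code assumes a normalized form of inequality (23) in which each term is divided by the interviewing cost and variance at stage m, and it takes σ₁₂/σ = .8 as inferred from the ratio 7/8 used in the first test of this example; both are assumptions rather than statements of the original formulas. The function name is hypothetical.

```python
from math import sqrt


def mail_until_test(m, y, p_non, sd_rel, c0_over_c2, c1_over_c2):
    """Return (lhs, rhs, justified) for continuing the mail attempts from m to y.

    p_non[i]  : P_i2, proportion still not responding after attempt i
    sd_rel[i] : sigma_i2 / sigma for the non-responding group after attempt i
    """
    s = {i: p_non[i] * sd_rel[i] ** 2 for i in range(m, y + 1)}
    lhs = 1.0 - (p_non[y] * sd_rel[y]) / (p_non[m] * sd_rel[m])
    rhs = 0.0
    for j in range(m, y):
        p_resp_next = p_non[j] - p_non[j + 1]          # P_(j+1)1
        unit_cost = (c0_over_c2 * p_non[j] + c1_over_c2 * p_resp_next) / p_non[m]
        rhs += sqrt((s[j] - s[j + 1]) / s[m] * unit_cost)
    return lhs, rhs, lhs > rhs


p_non = {1: 0.7, 2: 0.4, 3: 0.3, 4: 0.2, 5: 0.1}
sd_rel = {1: 0.8, 2: 0.7, 3: 0.6, 4: 0.5, 5: 0.4}

results = {y: mail_until_test(2, y, p_non, sd_rel, 0.3, 0.1) for y in (3, 4, 5)}
for y, (lhs, rhs, ok) in results.items():
    print(f"m = 2, y = {y}: lhs = {lhs:.3f}, rhs = {rhs:.3f}, continue by mail: {ok}")
```

For y = m+1 the test reduces to the single-step comparison, so the same function also covers the decision between the first and second attempts.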


CONCLUSION 


A complete coverage of all the non-respondents in a survey that involves 
call-backs is not in general an optimum procedure because there are many
cases where consecutive sampling of non-respondents at each stage will lead to 
a larger precision for the cost or to a smaller cost for the required precision. A 
set of sampling fractions which are optimum in this sense was obtained and 
the conditions for the applicability of the procedure were discussed. A step-by-step
test as to whether a further mail attempt is justified and another test as
to the optimum number of attempts are given.

The applicability of the results requires knowledge of the proportions of re- 
sponse in the various attempts and the relative variances of the non-responding 
groups. The results should be most useful therefore in repeated surveys and in 
situations where past experience can be utilized. Even in situations where the 
investigator can predict the proportions and the relative variances of the non- 
responding groups only one step ahead he will be able to calculate the optimum 
sampling fraction in each attempt and to test whether the next call-back should 
be executed by mail or by personal interviewing. In these latter situations he 
will not be able to calculate, before starting the investigation, the initial sample
size that leads to the required precision or cost; but the procedure he will follow 
will be most profitable in terms of cost or precision. 

The discussion has been confined to mailed questionnaires because we were
assuming throughout that the cost of contacting individuals was constant in
all attempts except the final interviewing stage. It could be used, however, in
an interviewing survey where we are ready to assume that the cost per interview 
in the consecutive attempts is constant, except for the final stage where, in order 
to reach those not available, cost becomes excessive. 
THE ECONOMIC DESIGN OF X̄ CHARTS USED TO MAINTAIN
CURRENT CONTROL OF A PROCESS*


ACHESON J. DUNCAN
The Johns Hopkins University 


This paper establishes a criterion that measures approximately the
average net income of a process under surveillance of an X̄ chart when
the process is subject to random shifts in the process mean. The quality
control rule assumed is that an assignable cause is looked for whenever
a point falls outside the control limits. The criterion is for the case in 
which it is assumed that the process is not shut down while the search 
for the assignable cause is in progress, nor is the cost of adjustment or 
repair and the cost of bringing the process back into a state of control 
after the assignable cause is discovered charged to the control chart 
program. 

The paper shows how to determine the sample size, the interval be- 
tween samples, and the control limits that will yield approximately 
maximum average net income. Numerical examples of optimum design 
are studied to see how variation in the various risk and cost factors 
affects the optimum. 


1. DETERMINATION OF A CRITERION FOR OPTIMUM DESIGN


CONTROL charts used in statistical quality control are essentially of two
kinds: those that are employed to bring a process under control and those
that are employed in maintaining control. This paper will be concerned with
the optimum design of the latter.

On $\bar{X}$ charts used to maintain current control of a process, the central line
is set at $\bar{X}''$ and the control limits are taken as $\bar{X}'' \pm k(\sigma''/\sqrt{n})$, where $\bar{X}''$ and
$\sigma''$ are “standard” values provided from past experience.¹ Samples of n are
taken from the process every h hours and the sample $\bar{X}$ is plotted on the $\bar{X}$
chart. If a sample $\bar{X}$ falls outside of the control limits, it is assumed that some
change in the process average $\bar{X}'$ has occurred and a search is undertaken for
the “assignable cause.”

This paper seeks a theoretical basis for answering three questions regarding 
the design of X charts used to maintain current control of a process. These are 


(1) How large a sample should be employed? 
(2) At what interval should the samples be taken? 
(3) What multiple of sigma should be used in determining the control limits? 


The paper will seek a procedure for determining the design that maximizes 
for the process the long run average net income per unit of time, on the assump- 
tion that we have knowledge of the risk of occurrence of an assignable cause 
and knowledge of various cost and income parameters. This maximum income 
criterion is the one of most interest to business and is therefore the natural 





* The writer is greatly indebted to I. R. Savage and G. Greggory of Stanford University for their criticism and 
suggestions in preparation of this paper. The paper was completed while the writer was working at Stanford Uni- 
versity under the auspices of the Office of Naval Research. 

1 In this paper unprimed values are sample values. Those with primes are process or universe values and those 
with double primes are standard values. Glossaries of symbols will be found at the end of the paper. 
one to apply.² It has been applied to the design of sampling plans (See [1],
[7], [8], [9], [10], [12], [13] and [14]) and has already been used in the study 
of optimum quality control rules (See [6]). This section of the paper will be 
concerned with formulating a mathematical function for measuring the average 
net income from a process operating under surveillance of an X chart. 

We shall assume at the start that the control chart is maintained to detect 
a single assignable cause that occurs at random and results in a change in the 
process of known proportions. In Section 4 the case of more than one assignable 
cause will be discussed. 

It will be presumed that the process begins in a state of control at the level
indicated by the standard values. This means that initially the process mean
$\bar{X}'$ equals the standard value $\bar{X}''$ and the process standard deviation $\sigma'$ equals
the standard value $\sigma''$. The assignable cause will take the form of a shift in
$\bar{X}'$ from $\bar{X}''$ to $\bar{X}'' + \delta\sigma''$ or from $\bar{X}''$ to $\bar{X}'' - \delta\sigma''$. It will be assumed that the
standard deviation remains equal to $\sigma''$.

To be specific about the assignable cause we shall assume, in accordance with
familiar waiting time analysis [5], that the probability of its non-occurrence
before time t when starting from a state of control is $e^{-\lambda t}$ and the probability of
its occurrence in the interval t to $t + \Delta t$ is approximately $\lambda e^{-\lambda t}\Delta t$. The average
time required for the assignable cause to occur will be $1/\lambda$.

If samples are taken at intervals of h hours, then, given the occurrence of 
the assignable cause in the interval between the nth and n+1st sample, the 
average time of occurrence within an interval between samples will be 


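$$\frac{\int_{nh}^{(n+1)h} e^{-\lambda t}\lambda\,(t - nh)\,dt}{\int_{nh}^{(n+1)h} e^{-\lambda t}\lambda\,dt} = \frac{\int_{0}^{h} e^{-\lambda T}\lambda T\,dT}{\int_{0}^{h} e^{-\lambda T}\lambda\,dT} = \frac{1 - (1 + \lambda h)e^{-\lambda h}}{\lambda\,(1 - e^{-\lambda h})}$$

$$= \frac{h}{2} - \frac{\lambda h^2}{12} \text{ plus terms of order } \lambda^3 h^4 \text{ or higher.}$$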


Numerical studies of a simpler model suggest that when the X̄ chart is designed
to detect shifts in the process mean of 2σ′ or more, optimum values of
h are likely to lie between 1 and 10. If λ equals .01, say, and h = 1, then λ³h⁴
will be of the order of .000001 compared with (h/2) − (λh²/12) = .4992. If λ = .01
and h = 10, then λ³h⁴ will be of the order (.000001)(10,000) = .01 compared with
(h/2) − (λh²/12) = 4.92. If λ = .05 and h = 20 so that λh = 1, the result given by
the exact equation will be 8.36 whereas (h/2) − (λh²/12) = 8.33. In formulating
the average net income function, therefore, the average time of occurrence of
the assignable cause within an interval of size h is taken as (h/2) − (λh²/12).
This simplification makes the formula for average net income an approximate
one, but it is very helpful in providing a formula with which we can work. The
approximation is apparently a good one and in any particular case the exact





2 Previous papers have sought optimum sample sizes for control charts by minimising the average amount of 
inspection required to detect a shift in the process ([2], [15], [16], [17]). Another paper has determined the frequency
of selecting samples for attribute inspection by limiting the probability of failing to detect a production run of a 
specified number of items [4]. 
formula can be used to check a proposed design if this is believed to be desirable.
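
The quality of this approximation can be checked numerically with a small sketch. It assumes the exponential waiting-time model stated above, for which the conditional mean time of occurrence within an interval has the closed form used in `tau_exact`; the function names are hypothetical.

```python
from math import exp


def tau_exact(lam, h):
    """Mean time of occurrence within a sampling interval, exact under the exponential model."""
    return (1.0 - (1.0 + lam * h) * exp(-lam * h)) / (lam * (1.0 - exp(-lam * h)))


def tau_approx(lam, h):
    """Two-term approximation h/2 - lam*h^2/12 used in the net income function."""
    return h / 2.0 - lam * h * h / 12.0


for lam, h in [(0.01, 1.0), (0.01, 10.0), (0.05, 20.0)]:
    exact, approx = tau_exact(lam, h), tau_approx(lam, h)
    print(f"lambda={lam}, h={h}: exact={exact:.4f}, approx={approx:.4f}, "
          f"difference={exact - approx:.5f}")
```

The difference stays small even when λh is as large as 1, which is the case the text singles out.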

It will be assumed that in the operation of the X̄ chart the rule will be followed
of taking action when a sample point falls outside the control limits.
Other rules might not require action unless two points in succession fall outside 
the control limits. In still other cases, action may be taken when unusual pat- 
terns occur within the control limits even though no point has fallen outside 
the limits. The question of what is the optimum rule is beyond the scope of 
this paper (see [6]). We shall restrict ourselves here to the simple rule of taking 
action only when a point exceeds the control limits. 

When the assignable cause has occurred, let P be the probability that the 
assignable cause will be detected, and Q the probability that it will go un- 
detected, on the occasion of a single sampling of the process. In accordance 
with the rule adopted P will be the probability that a sample point will fall out- 
side the control limits and Q that it will fall within the limits when there has 
been a change in the process mean. Thus, when the process mean has shifted
from $\bar{X}''$ to $\bar{X}'' + \delta\sigma''$,


$$P = \int_{-\infty}^{-k - \delta\sqrt{n}} \frac{e^{-z^2/2}}{\sqrt{2\pi}}\,dz + \int_{k - \delta\sqrt{n}}^{\infty} \frac{e^{-z^2/2}}{\sqrt{2\pi}}\,dz.$$

In most instances where δ > 0, the first term is practically zero. (For a negative δ,
the second term would be practically zero.) When the process is in control, the
probability of a sample point falling outside the control limits is


$$\alpha = 2\int_{k}^{\infty} \frac{e^{-z^2/2}}{\sqrt{2\pi}}\,dz.$$


It will be assumed that the rate of production is sufficiently high that we can 
neglect the possibility of a change in the process occurring during the taking 
of a sample. 
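
As a minimal sketch, the two probabilities just defined can be evaluated from the complementary error function in the standard library. The function names are hypothetical; the trial values are those used in the numerical illustration of Section 2.

```python
from math import erfc, sqrt


def upper_tail(x):
    """Area under the standard normal curve from x to infinity."""
    return 0.5 * erfc(x / sqrt(2.0))


def false_alarm_prob(k):
    """alpha: probability that an in-control sample mean falls outside k-sigma limits."""
    return 2.0 * upper_tail(k)


def detection_prob(n, k, delta):
    """P: probability that a sample of n signals after a shift of delta standard deviations."""
    shift = delta * sqrt(n)
    # First term is the tail beyond the far limit, second the tail beyond the near limit.
    return upper_tail(k + shift) + upper_tail(k - shift)


alpha_25 = false_alarm_prob(2.5)
alpha_32 = false_alarm_prob(3.2)
p_detect = detection_prob(5, 3.2, 2.0)
print(f"alpha(k=2.5) = {alpha_25:.4f}, alpha(k=3.2) = {alpha_32:.5f}, "
      f"P(n=5, k=3.2, delta=2) = {p_detect:.5f}")
```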

In the particular model discussed in this paper, the process is allowed to 
continue in operation during the search for the assignable cause. It is also 
assumed that the cost of adjustment or repair (including possible shutting down 
of the process) and the cost of bringing the process back to a state of control 
subsequent to the discovery of the assignable cause will not be charged against 
the net income from the process—at least for the purpose of determining opti- 
mum control chart design. Other theoretical models under study allow for a 
shutting down of the process immediately following the discovery of a point 
outside the control limits, and for the charging of the cost of adjustment or 
repair to the net income from the process. The present model is apparently the 
simplest and has therefore been studied first. 

On the basis of the foregoing we may determine in the following way the 
long run average net income per hour related to the operation of the process 
under surveillance of an X chart. 

(a) We note again that the average time for occurrence of the assignable 
cause will be 1/A. 

(b) After the occurrence of the assignable cause, the probability that it will
be caught on the rth sample taken after the shift is $Q^{r-1}P$. The mean number
of samples taken before the shift in the process will be caught will accordingly
[5] be 1/P. If h is the interval between samples and if we assume as above that
the change occurs on the average approximately at the point $(h/2) - (\lambda h^2/12)$
in the interval, then $h/P - (h/2 - \lambda h^2/12) = (1/P - 1/2 + \lambda h/12)h$ will be
approximately the average time the process will be out of control before a
sample point falls outside of the control limits.

(c) We assume that the time required to take and inspect a sample and to
compute the results is proportional to the sample size n, viz., delay in plotting
a point = en.

(d) Let D be the average time taken to find an assignable cause after a point
has been found to fall outside the control limits. Then from the above it follows
that in many repetitions the proportion of the time a process will be in control
will be³


$$\beta = \frac{1/\lambda}{1/\lambda + (1/P - 1/2 + \lambda h/12)h + en + D}$$


and the proportion of the time it will be out of control will be


$$\gamma = \frac{(1/P - 1/2 + \lambda h/12)h + en + D}{1/\lambda + (1/P - 1/2 + \lambda h/12)h + en + D}.$$


It should be noted that β and γ pertain to portions of normal operating time
excluding the time devoted to elimination of the assignable cause after it has
been found and to restoration of a state of control.

(e) When the process shifts from $\bar{X}''$ to $\bar{X}'' + \delta\sigma''$ or $\bar{X}'' - \delta\sigma''$, it will be
assumed that the proportion of defective items produced will be increased. Let
$V_0$ be the average income per hour accruing from operation of the process under
controlled conditions at the standard level $\bar{X}''$, and let $V_1$ be the average income
per hour accruing from operation at the new level $\bar{X}'' + \delta\sigma''$, and let
$M = V_0 - V_1$. It will be assumed that the process was originally centered between
the specification limits so that M will be the same for −δ as for +δ.

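(f) If h is the interval between samples measured in hours, the expected
number of false alarms before the process goes out of control will be α times
the expected number of samples taken in the period. This is


$$\alpha\sum_{i=0}^{\infty} i\int_{ih}^{(i+1)h} \lambda e^{-\lambda t}\,dt = \alpha\sum_{i=0}^{\infty} i\left(e^{-\lambda ih} - e^{-\lambda(i+1)h}\right) = \alpha\left(1 - e^{-\lambda h}\right)\sum_{i=0}^{\infty} i\,e^{-\lambda ih}$$

$$= -\alpha\left(1 - e^{-\lambda h}\right)\frac{1}{h}\,\frac{\partial}{\partial\lambda}\sum_{i=0}^{\infty} e^{-\lambda ih} = \frac{\alpha\,e^{-\lambda h}}{1 - e^{-\lambda h}}$$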


which, if we neglect terms of order λ²h² or higher,⁴ is equal to α/λh. The expected
number of false alarms per hour of operations will thus be approximately




³ The writer is greatly indebted to Paul Meier of The Johns Hopkins University for his suggestion of this simplified
approach.

⁴ Compare use of (h/2) − (λh²/12) as the approximate average time of occurrence of the assignable cause within
an interval.
$$\frac{\alpha/\lambda h}{1/\lambda + (1/P - 1/2 + \lambda h/12)h + en + D} = \frac{\beta\alpha}{h}.$$


If T is the cost of looking for an assignable cause when none exists, then the
expected loss per hour of operations because of false alarms will be approximately
βαT/h.

(g) Since the average cycle length of in-control: out-of-control is


$$1/\lambda + (1/P - 1/2 + \lambda h/12)h + en + D \text{ hours},$$


the average number of times per hour that the process actually goes out of
control is


$$\epsilon = \frac{1}{1/\lambda + (1/P - 1/2 + \lambda h/12)h + en + D}.$$

If the average cost of finding the assignable cause when it occurs is W, the
average cost per hour on this account will be εW.

(h) The part of the hourly cost of maintaining the control chart that is 
related to its design we assume is given by the simple linear function 
(b/h) + (cn/h), where b is the cost per sample of sampling and charting that is
independent of sample size, and c is the cost per unit of measuring an item of 
product and other control chart operations directly related to the size of the 
sample. 

(i) From the above it follows that the net income per hour will average over 
a long period of operations approximately as follows: 





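$$I = \beta V_0 + \gamma V_1 - \frac{\beta\alpha T}{h} - \epsilon W - \frac{b}{h} - \frac{cn}{h}.$$

If we note that $V_1 = V_0 - M$ and that β + γ = 1, this becomes

$$I = V_0 - \gamma M - \frac{\beta\alpha T}{h} - \epsilon W - \frac{b}{h} - \frac{cn}{h}.$$

Upon inserting the values of γ, β and ε, we get

$$I = V_0 - \frac{\lambda MB + \alpha T/h + \lambda W}{1 + \lambda B} - \frac{b}{h} - \frac{cn}{h} \tag{1}$$

where B = (ah + en + D) and

$$a = (1/P - 1/2 + \lambda h/12). \tag{2}$$


If we set

$$L = \frac{\lambda MB + \alpha T/h + \lambda W}{1 + \lambda B} + \frac{b}{h} + \frac{cn}{h},$$

which may be called the loss-cost, then I will be approximately a maximum for
n, h and k when L is a minimum, since $V_0$ is independent of these variables.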
Our criterion of optimum design will therefore be minimum L. 
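
The loss-cost can be evaluated directly from its definition. The sketch below assumes the form L = (λMB + αT/h + λW)/(1 + λB) + b/h + cn/h with B and a as in (1) and (2); the function and parameter names are hypothetical. It is tried on designs and parameter values for which Section 2 quotes loss-costs per 100 hours of operation.

```python
from math import erfc, sqrt


def upper_tail(x):
    """Area under the standard normal curve from x to infinity."""
    return 0.5 * erfc(x / sqrt(2.0))


def loss_cost(n, h, k, delta, lam, M, e, D, T, W, b, c):
    """Hourly loss-cost L of an X-bar chart with sample size n, interval h, limits k-sigma."""
    alpha = 2.0 * upper_tail(k)
    shift = delta * sqrt(n)
    P = upper_tail(k + shift) + upper_tail(k - shift)
    a = 1.0 / P - 0.5 + lam * h / 12.0
    B = a * h + e * n + D          # average hours out of control per cycle
    return (lam * M * B + alpha * T / h + lam * W) / (1.0 + lam * B) + b / h + c * n / h


base = dict(delta=2.0, lam=0.01, M=100.0, T=50.0, W=25.0, b=0.50, c=0.10)

slow_test_a = 100 * loss_cost(2, 0.9, 2.7, e=0.5, D=2.0, **base)
slow_test_b = 100 * loss_cost(5, 1.4, 3.1, e=0.5, D=2.0, **base)
long_search = 100 * loss_cost(5, 1.4, 3.1, e=0.05, D=20.0, **base)
print(f"e=0.5:  n=2,h=.9,k=2.7 -> ${slow_test_a:.2f};  n=5,h=1.4,k=3.1 -> ${slow_test_b:.2f}")
print(f"D=20:   n=5,h=1.4,k=3.1 -> ${long_search:.2f}")
```

Because V₀ does not depend on n, h or k, comparing designs by this function is equivalent to comparing them by average net income.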


2. FINDING AN APPROXIMATION TO THE OPTIMUM DESIGN 


Numerical study of the function L suggests that for realistic values of the
parameters δ, λ, M, e, D, T, W, b and c, a local minimum does exist in the
neighborhood of values of n, h, and k that would be used in practice. No attempt
is made here to study other possible minima.⁵ Rather, attention is directed
primarily to deriving a relatively convenient procedure for approximating the
local minimum that could be adopted in practice. The derivation of this procedure
follows:

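First, let us note what relationships must exist between the optimum values
of n, h and k. Thus setting equal to zero the partial derivatives of L with
respect to n, h and k (noting that P is a function of n and k, and α is a function
of k and for the moment treating n as if it were continuous), we get


$$\lambda h\,\frac{\partial B}{\partial n}\left(M - \frac{\alpha T}{h} - \lambda W\right) + c\,(1 + \lambda B)^2 = 0 \tag{3}$$

$$\lambda h^2\,\frac{\partial B}{\partial h}\left(M - \frac{\alpha T}{h} - \lambda W\right) - \alpha T(1 + \lambda B) - (b + cn)(1 + \lambda B)^2 = 0 \tag{4}$$

$$\lambda\,\frac{\partial B}{\partial k}\left(M - \frac{\alpha T}{h} - \lambda W\right) + \frac{T}{h}\,(1 + \lambda B)\,\frac{\partial\alpha}{\partial k} = 0 \tag{5}$$

where

$$\frac{\partial B}{\partial n} = -\frac{h\,\partial P/\partial n}{P^2} + e,\qquad \frac{\partial B}{\partial h} = \frac{1}{P} - \frac{1}{2} + \frac{\lambda h}{6},\qquad \frac{\partial B}{\partial k} = -\frac{h\,\partial P/\partial k}{P^2}.$$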

Equations (3), (4) and (5) do not yield simple expressions for evaluating
the optimum n, h and k. Approximations are therefore sought. When likely
numerical values for the given parameters are substituted in (4), say λ = .01,
M = $100, e = .05, D = 2, T = $50, W = $25, b = $.50 and c = $.10, and it is guessed
that α will be small, say .003, and h will be 1 or 2, so that terms like αT/h and
λB may be neglected, it is seen that the optimum h is roughly of the order of
1/√λ. We therefore boldly seek approximations to the solutions of the exact
equations (3)-(5), by assuming λ small and neglecting all terms in an equation
of a smaller order of magnitude than the principal term. This gives us





⁵ No evidence of the existence of other minima has been found.
$$-\frac{\lambda h^2 M\,\partial P/\partial n}{P^2} + c = 0 \tag{3'}$$

$$\lambda h^2 M\left(\frac{1}{P} - \frac{1}{2}\right) - \alpha T - b - cn = 0 \tag{4'}$$

$$-\frac{\lambda h^2 M\,\partial P/\partial k}{P^2} + T\,\frac{\partial\alpha}{\partial k} = 0. \tag{5'}$$


Equation (4′) immediately gives us


$$h \approx \sqrt{\frac{\alpha T + b + cn}{\lambda M(1/P - 1/2)}}. \tag{6}$$


Using this approximate value of h in (3′) we get, after some rearrangement,


$$\frac{P^2(1/P - 1/2)}{\partial P/\partial n} - n = \frac{\alpha T + b}{c}. \tag{7}$$








Fig. 1 is a chart for solving (7) for n for selected values of k and δ. For δ = 2,
Fig. 1 is supplemented by TABLE 1.
Finally, when equation (5′) is combined with equation (3′), it is found that


$$\frac{\partial\alpha}{\partial k} = -\frac{2c\sqrt{n}}{\delta T}$$


or


$$\frac{e^{-k^2/2}}{\sqrt{2\pi}} = \frac{c\sqrt{n}}{\delta T}. \tag{8}$$


Equations (6), (7) and (8) provide a relatively simple procedure for approximating
the optimum values of n, h and k for given values of δ, λ, M, e, D, T,
b, and c. The steps to be taken are as follows:

As a guess, start with α = .0124, which corresponds to an “average” k of 2.5, and
find an approximate n from Fig. 1 or TABLE 1 for the selected value of δ. Insert
this n in equation (8) and from the normal probability tables find an approximate
k. Compute the α corresponding to this first approximate k, go back to
Fig. 1 or TABLE 1, and find a second approximate n. Use this again in equation
(8) to find a second approximate k. (Usually it is not necessary to proceed to a
third approximation.) Use the second approximate n and k in equation (6) to
compute an approximate h, noting that for δ > 0, P equals practically


$$\int_{k - \delta\sqrt{n}}^{\infty} \frac{e^{-z^2/2}}{\sqrt{2\pi}}\,dz$$

or the area under the normal curve from $k - \delta\sqrt{n}$ to ∞. It is recommended
that k and h be rounded off to the nearest 10th; n, of course, must be an integer.
To illustrate this procedure, let δ = 2, λ = .01, M = $100, e = .05, D = 2,
T = $50, W = $25, b = $.50, and c = $.10. Then, for α = .0124, corresponding
to k = 2.5, we have (αT + b)/c = 11.2. Using TABLE 1, we derive a first approximate
n of 5. For this n, we have c√n/δT = .00223, which yields k = 3.2. For this
k, (αT + b)/c = 5.7 and TABLE 1 suggests that the second approximate n
should be 5. This is the same as the first approximate n and yields a second
approximate k of 3.2. With n = 5, k = 3.2, we get α = .00137, P = .89831,
(1/P − 1/2) = .61320, and h = 1.3. The optimum chart for the given cost and
risk factors is thus one that has values approximately n = 5, h = 1.3 and k = 3.2.

[FIG. 1. Graph of P²(1/P − 1/2)/(∂P/∂n) − n as a function of n. To find the near-optimum n, take the ordinate equal to (αT + b)/c.]

[TABLE 1. P²(1/P − 1/2)/(∂P/∂n) − n for δ = 2, selected values.]
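
The last two steps of the procedure can be retraced with a short sketch: equation (8) is solved in closed form for k, and equation (6) then gives h. The sample size n = 5 is taken as given, since it is read from TABLE 1 in the text; the function names are hypothetical and the rounding to the nearest tenth follows the recommendation above.

```python
from math import erfc, log, pi, sqrt


def upper_tail(x):
    """Area under the standard normal curve from x to infinity."""
    return 0.5 * erfc(x / sqrt(2.0))


def approx_k(n, delta, T, c):
    """Solve equation (8), exp(-k^2/2)/sqrt(2*pi) = c*sqrt(n)/(delta*T), for k."""
    ordinate = c * sqrt(n) / (delta * T)
    # A real solution exists only when ordinate * sqrt(2*pi) is below 1.
    return sqrt(-2.0 * log(ordinate * sqrt(2.0 * pi)))


def approx_h(n, k, delta, lam, M, T, b, c):
    """Equation (6), with P taken as the single tail beyond k - delta*sqrt(n)."""
    alpha = 2.0 * upper_tail(k)
    P = upper_tail(k - delta * sqrt(n))
    return sqrt((alpha * T + b + c * n) / (lam * M * (1.0 / P - 0.5)))


n = 5
ordinate = 0.10 * sqrt(n) / (2.0 * 50.0)
k = round(approx_k(n, delta=2.0, T=50.0, c=0.10), 1)
h = round(approx_h(n, k, delta=2.0, lam=0.01, M=100.0, T=50.0, b=0.50, c=0.10), 1)
print(f"ordinate = {ordinate:.5f}, k = {k}, h = {h}")
```

Solving (8) analytically replaces the lookup in normal probability tables; the choice of n from Fig. 1 or TABLE 1 is left outside the sketch.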

This approximate method was checked for 12 different examples by computing
by trial and error the exact design that will minimize the loss-cost L
(within ±0.1 for both h and k). The results are shown in TABLE 2 together with
other data of interest. In every instance but one, the approximate method yields
a design for which the loss-cost is within 3% of the minimum and in most cases
is better than this. TABLE 2 also affords a comparison of the loss-cost yielded
by the approximate optimum designs with that yielded by an arbitrarily designed
X̄ chart for which n = 5, h = 1, and k = 3.

The approximate procedure takes no account of the values of e and D and 
was obtained by assuming that these quantities were not exceptionally large. 
For e = .05 and D = 2, TABLE 2 shows, as we have seen, that the approximate
procedure works very well. It also appears to work well for D at least as large
as 20. When the exact optimum design was worked out for parameter values
equal to those of Example 1 except that D was given the value 20, the result was
n = 6, h = 1.8 and k = 3.2, which is not far different from the optimum design
for D = 2, viz., n = 5, h = 1.4, k = 3.1. The loss-costs for the two designs applied
to the case D = 20 are, for 100 hours of operations:


n = 6, h = 1.8, k = 3.2     L = $1,836.55
n = 5, h = 1.4, k = 3.1     L = $1,838.96


The difference is insignificant. 

For relatively high values of e, the approximate procedure does not appear
to work so well as for relatively low values. When e was given the value 0.5
instead of 0.05, the optimum design was n=2, h=.9, k=2.7. This differs markedly
from the optimum design when e=.05, viz., n=5, h=1.4 and k=3.1. The
loss-costs of the two designs applied to the case e=0.50 are, for 100 hours of
operation:


n=2, h=.9,  k=2.7    L = $540.35
n=5, h=1.4, k=3.1    L = $607.92


This difference is over 11% of the L for the exact optimum design. 

The inaccuracy of the approximate method when e is relatively high is not
a serious condemnation of the approximate procedure, since in most cases in
practice e will be relatively small. It is to be remembered that e is the rate at
which the time between the taking of the sample and the plotting of the sample
point on the \(\bar{X}\) chart increases with the sample size n. It is mainly dependent
on the time required to test an item. If the tests must be run in succession by
a single inspector, an e=0.05 means that it takes approximately 3 minutes to
test each piece. Under the same conditions an e=0.5 would mean that each
test would require approximately half an hour. If M is large, it is unlikely
that e would be large, since it would probably pay the management to have
several sets of test equipment and possibly several inspectors. If it turns out
that e is necessarily large, then, as noted above, the approximate procedure
should be used with caution. An amplified model might add e to the other
design elements, n, h, and k, for which optimizing values are to be determined.

The approximate method also takes no direct account of the value of W. 
It is assumed throughout the numerical analysis, however, that W is always 
50% of T and in any case would generally vary with T. Variations in W relative 
to T were not considered to be worth investigation. 
3. EFFECT OF VARIATION IN RISK AND COST FACTORS ON THE OPTIMUM DESIGN


The effect of variation in risk and cost factors on the optimum design may
be gleaned from a study of TABLE 2. The values assigned to the cost and risk
factors in this table cover a wide range of possibilities and are believed to be
generally typical of industrial costs and risks. The loss-rate M is the reduction
in income per hour that is attributed to the occurrence of the assignable cause.
The values of M are derived from the assumption that the rate of production
is constant and the specification limits on the process fall at \(\bar{X}'' \pm 3\sigma''\). When
\(\bar{X}'\) changes by \(2\sigma''\), M is arbitrarily given the value $100 per hour. The other
values of M ($12.87 and $2.25) follow if we assume that the quality characteristic
X is normally distributed and that M is proportional to the increase in the
percentage of defective items.

TABLE 2 suggests the following general conclusions: 

(a) The optimum sample size is largely determined by \(\delta\). If we wish a control
chart to detect shifts in \(\bar{X}'\) of \(2\sigma''\) or more, sample sizes of 2 to 6 are likely to
be optimum. If we wish to detect shifts of \(1\sigma''\), sample sizes of 8 to 20 are likely to
be optimum. If, however, we wish to detect shifts of \(0.5\sigma''\) or less, we must
generally take samples of 40 or more. It would thus seem that the \(\bar{X}\) charts
now commonly in use are good for detecting the larger shifts in \(\bar{X}'\) but not for
detecting the smaller shifts.

This inadequacy of the usual \(\bar{X}\) chart employing samples of 4 or 5 to detect
small shifts in \(\bar{X}'\) is considered by H. Weiler a serious condemnation of accepted
control chart procedure [15]. The present writer does not share these sentiments.
If process specifications are at \(\bar{X}'' \pm 4\sigma''\), i.e., the product is controlled
well within specifications, then small shifts in \(\bar{X}'\) may occur without producing
any defective material.* Even if specification limits are at \(\bar{X}'' \pm 3\sigma''\), a shift in
\(\bar{X}'\) of \(0.5\sigma''\) will produce only 0.37% more defective material, if the output is
normally distributed. This might be serious if no defective material at all
could be tolerated and the control chart was used for acceptance procedures.
Such use of the control chart is unlikely under these circumstances. Small
shifts in \(\bar{X}'\) do have a serious effect on the percent defective if the specifications
fall at \(\bar{X}'' \pm 2\sigma''\) or closer, but under such circumstances the control chart is
likely to be supplemented by 100% attribute inspection or some AOQL attribute
sampling plan.
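The 0.37% figure can be verified directly from the normal distribution. The following is a minimal sketch, assuming two-sided specification limits placed symmetrically about the standard mean and a process standard deviation equal to the standard value; the function names are illustrative.

```python
import math

def norm_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))

def fraction_defective(shift, spec):
    """Fraction outside limits at +/- spec sigma when the mean has moved by shift sigma."""
    return norm_cdf(-(spec - shift)) + norm_cdf(-(spec + shift))

# Specification limits at 3 sigma, shift of 0.5 sigma.
base = fraction_defective(0.0, 3.0)
shifted = fraction_defective(0.5, 3.0)
increase_pct = 100.0 * (shifted - base)

# The same shift with limits at 4 sigma leaves almost nothing defective.
shifted_4sigma = fraction_defective(0.5, 4.0)

print(f"increase with 3-sigma limits: {increase_pct:.2f}%")
print(f"fraction defective with 4-sigma limits: {shifted_4sigma:.6f}")
```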

(b) Variation in the loss rate M has its dominant effect on the optimum 
interval between samples, h. When M is relatively small, h should be large; 
when M is relatively large, h should be small. Variation in M has little effect 
on the optimum values of n and k. 

(c) Variation in the cost of looking for trouble when none exists (T) and the
accompanying variation in the average cost of looking for trouble when it does
exist (W) have their primary effect upon the optimum value of k. The sample
size n is affected moderately. Thus for small T and W we should use charts
with 2.5 sigma limits, for example, but with large T and W we should use
charts with 3.5 and 4 sigma limits. Variation in T and the accompanying
variation in W have practically no effect on the frequency of sampling.


* It will be noted that the process standard deviation \(\sigma'\) is assumed to equal the standard value \(\sigma''\), both initially
and after the shift in \(\bar{X}'\).
(d) Variation in the unit cost of inspection and charting (c) affects all three
of the elements of design. For high values of c, the optimum design calls for
taking small samples, possibly only samples of 2, at large intervals between
samples, and with control limits at low multiples of sigma. In the numerical
examples of TABLE 2, the only cases in which an optimum k was less than 1.5 were
those for which c had a relatively large value.

(e) Variation in the cost of “visiting” the process to take a sample (b) affects
primarily the frequency of sampling. It also has a moderate effect on the
sample size.

(f) Variation in the delay factor (e) affects all three of the design elements.
A large e leads to a smaller sample size, a shorter interval between samples, and
tighter control limits.


4. SEVERAL ASSIGNABLE CAUSES 


When the model under study is expanded to include the design of an \(\bar{X}\) chart
with respect to several assignable causes, only minor changes need be made in
the theoretical argument. If we assume that the various assignable causes occur
independently, we may use the waiting time theory referred to in Section 1.
On the assumption that the process starts in a state of control, the probability
that no assignable cause will have occurred at the end of time t will be
\(e^{-\sum\lambda_j t}\). The probability that cause j occurs in the interval t to
\(t+\Delta t\) and other causes occur no sooner than this will be

\[ e^{-\sum\lambda_j t}\,\lambda_j\,\Delta t, \]

and the probability that any of the several causes occurs in the interval t to
\(t+\Delta t\) and no cause occurs before that will be approximately

\[ e^{-\sum\lambda_j t}\,\sum\lambda_j\,\Delta t. \]

Over many repetitions, therefore, the mean time the process will be in control
will be approximately

\[ \int_0^\infty t\,e^{-\sum\lambda_j t}\,\sum\lambda_j\,dt = \frac{1}{\sum\lambda_j}. \]
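The waiting-time result can be illustrated by simulation. The sketch below draws an independent exponential occurrence time for each cause and records which one arrives first. The three rates are hypothetical values chosen only for illustration. Under the independence assumption, the mean time in control should be close to the reciprocal of the sum of the rates, and cause j should be the first to occur in a share \(\lambda_j/\sum\lambda_j\) of the repetitions.

```python
import random

def simulate_in_control_times(rates, trials, seed=0):
    """Simulate competing exponential causes; return mean time in control
    and the share of repetitions in which each cause occurred first."""
    rng = random.Random(seed)
    total_time = 0.0
    first_counts = [0] * len(rates)
    for _ in range(trials):
        times = [rng.expovariate(rate) for rate in rates]
        first = min(range(len(rates)), key=times.__getitem__)
        total_time += times[first]
        first_counts[first] += 1
    mean_time = total_time / trials
    shares = [count / trials for count in first_counts]
    return mean_time, shares

rates = [0.01, 0.02, 0.005]  # hypothetical occurrence rates per hour
mean_time, shares = simulate_in_control_times(rates, 200_000)

print(f"simulated mean time in control: {mean_time:.2f}")
print(f"theoretical value 1/sum(rates): {1.0 / sum(rates):.2f}")
print("share of each cause:", [round(s, 3) for s in shares])
```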
As before, when the process goes out of control because of cause j, the
average length of time it will stay out of control before the assignable cause is
discovered will be approximately

\[ \left(\frac{1}{P_j} - \frac{1}{2} + \frac{\lambda_j h}{12}\right)h + en + D_j. \]

Furthermore, the proportion of the time that it will be cause j that is present
when the process is out of control will be \(\lambda_j/\sum\lambda_j\). Hence, when there are
several assignable causes, the net income function will take the form

\[ I' = \beta V_0 + \gamma_1 V_1 + \gamma_2 V_2 + \gamma_3 V_3 + \cdots \]




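where

\[ \beta = \frac{\dfrac{1}{\sum\lambda_j}}{\dfrac{1}{\sum\lambda_j} + \sum_j \dfrac{\lambda_j}{\sum\lambda_j}\left[(1/P_j - 1/2 + \lambda_j h/12)h + en + D_j\right]} \]

or

\[ \beta = \frac{1}{1 + \sum_j \lambda_j\left[(1/P_j - 1/2 + \lambda_j h/12)h + en + D_j\right]}, \]

and

\[ \gamma_j = \frac{\lambda_j\left[(1/P_j - 1/2 + \lambda_j h/12)h + en + D_j\right]}{1 + \sum_i \lambda_i\left[(1/P_i - 1/2 + \lambda_i h/12)h + en + D_i\right]}. \]

If \(M_j = V_0 - V_j\), this can be written

\[ I' = V_0 - \sum_j \gamma_j M_j - \beta\,\frac{\alpha T}{h} - \beta\sum_j \lambda_j W_j - \frac{b}{h} - \frac{cn}{h}, \]

and if we define

\[ L' = \sum_j \gamma_j M_j + \beta\,\frac{\alpha T}{h} + \beta\sum_j \lambda_j W_j + \frac{b}{h} + \frac{cn}{h}, \]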

the problem of finding the optimum design is that of finding values of n, h, and 
k that will minimize L’. 

This section on several assignable causes is included to complete the
theoretical argument. In practice, it will probably be determined at the start
whether a chart is to give protection against small, medium or large shifts in
the process mean and then the theory and procedure of Sections 1 to 3 may
be applied.


5. EXTENSION OF THE ANALYSIS 


The preceding discussion has been confined to \(\bar{X}\) charts but the theory of
Section 1 can be applied to any type of control chart such as an R chart, p chart
or c chart. In all these cases, however, the task of working out practical
approximate procedures is yet to be accomplished.

The application of economic principles to the design of acceptance sampling 
plans has already been considered by Anscombe [1], Hamaker [7], Moriguti 
[8], Satterthwaite [9, 10], Sittig [12], Ura [13] and Weibull [14]. Profitable 
investigation along these lines will probably continue. 


GLOSSARY OF PARAMETERS 


\(\delta\)—the mean of the process is assumed to shift by \(\delta\sigma''\).
\(\lambda\)—the parameter related to the probability of occurrence of the assignable
cause. Starting in a state of control at time t=0, the probability the process
will still be in control at time \(t_1\) is \(e^{-\lambda t_1}\).
\(V_0\)—the rate per hour at which income accrues from operation of the process
when in a state of control.
\(V_1\)—the rate per hour at which income accrues from operation of the process
when the mean of the process has shifted by \(\delta\sigma''\).
M—equals \(V_0 - V_1\).
e—the rate at which the time between the taking of a sample and the plotting
of a point on the \(\bar{X}\) chart increases with the sample size n. Delay equals en.
D—the average time taken to find the assignable cause after a point plotted on
the chart falls outside the control limits.
T—the cost per occasion of looking for an assignable cause when none exists.
W—the average cost per occasion of finding the assignable cause when it exists.
b—the cost per sample of sampling and plotting that is independent of the
sample size.
c—the cost per unit of sampling, testing and computation that is related to
the sample size. The relationship is assumed to be linear.
\(\sigma'\)—the standard deviation of the process.
\(\sigma''\)—the “standard” value taken for sigma in setting up the control limits.
Throughout the analysis \(\sigma' = \sigma''\).


GLOSSARY OF VARIABLES 
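n—the sample size.
h—the interval between samples measured in hours.
k—the control limits for the \(\bar{X}\) chart occur at \(\bar{X}'' \pm k\sigma''/\sqrt{n}\).

\[ P = \int_{-\infty}^{-k-\delta\sqrt{n}} \frac{e^{-z^2/2}}{\sqrt{2\pi}}\,dz + \int_{k-\delta\sqrt{n}}^{\infty} \frac{e^{-z^2/2}}{\sqrt{2\pi}}\,dz \]

\[ Q = 1 - P \]

\[ \alpha = 2\int_{k}^{\infty} \frac{e^{-z^2/2}}{\sqrt{2\pi}}\,dz \]

L—the loss-cost. L equals

\[ \frac{\lambda M B + \alpha T/h + \lambda W}{1 + \lambda B} + \frac{b}{h} + \frac{cn}{h}, \]

where

B = (ah + en + D) and
a = (1/P - 1/2 + \(\lambda\)h/12).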
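The loss-cost can be evaluated numerically from the definitions in this glossary. The sketch below is illustrative only: the values \(\delta\)=2, \(\lambda\)=0.01, M=100, T=50, W=25, b=0.5 and c=0.1 are assumptions, since this section does not restate the parameters of Example 1. They are used here because, under the reading of the formula given above, they come close to the loss-costs quoted earlier for e=0.50 and for D=20 with n=5.

```python
import math

def norm_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))

def loss_cost(n, h, k, *, delta, lam, M, e, D, T, W, b, c):
    """Loss-cost per hour for an X-bar chart with sample size n, interval h, limits k."""
    alpha = 2.0 * norm_cdf(-k)                        # false-alarm probability
    root = delta * math.sqrt(n)
    P = norm_cdf(root - k) + norm_cdf(-root - k)      # detection probability
    a = 1.0 / P - 0.5 + lam * h / 12.0
    B = a * h + e * n + D                             # mean time out of control
    return (lam * M * B + alpha * T / h + lam * W) / (1.0 + lam * B) + b / h + c * n / h

# Assumed cost and risk factors (not stated in this section).
ASSUMED = dict(delta=2.0, lam=0.01, M=100.0, T=50.0, W=25.0, b=0.5, c=0.1)

# Loss-cost per 100 hours of operation for three designs discussed in the text.
L_e_high_opt = 100.0 * loss_cost(2, 0.9, 2.7, e=0.5, D=2.0, **ASSUMED)
L_e_high_std = 100.0 * loss_cost(5, 1.4, 3.1, e=0.5, D=2.0, **ASSUMED)
L_D_large = 100.0 * loss_cost(5, 1.4, 3.1, e=0.05, D=20.0, **ASSUMED)

print(f"e=0.5, n=2, h=0.9, k=2.7: ${L_e_high_opt:,.2f}")
print(f"e=0.5, n=5, h=1.4, k=3.1: ${L_e_high_std:,.2f}")
print(f"D=20,  n=5, h=1.4, k=3.1: ${L_D_large:,.2f}")
```

A function of this kind can also be searched over a grid of n, h and k to locate a minimum by trial and error, in the spirit of the exact designs described in the text.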
ECONOMICALLY OPTIMUM ACCEPTANCE TESTS*


JOHN V. BREAKWELL
North American Aviation 


An economical balance is sought between the cost of testing and the 
expectation of loss due to accepting an insufficiently reliable product or 
rejecting a sufficiently reliable one, the desired reliability being very 
high. When the articles to be tested are classifiable into just two groups, 
satisfactory or defective, use is made of Poisson approximations to obtain
“optimum” sequential tests as well as optimum fixed size tests.
The results are presented in charts. When the articles to be tested are 
classifiable according to some physical measurement whose underlying 
distribution is supposed to be normal, but is otherwise unknown, the 
optimization is based on previously obtained solutions using normal 
approximations. Examples are given. 


1. INTRODUCTION


Suppose that a new product is to be tested for acceptance or rejection. The
problem to be discussed here is that of choosing the test to conform with a
certain economic background. This background consists of: (1) a certain loss if
we accept a product whose fraction p of defectives is greater than a known
critical fraction \(p_c\); (2) a certain loss if we reject a product whose fraction p of
defectives is actually less than the critical fraction \(p_c\); (3) the cost of the test. The
losses (1) and (2), moreover, will in general be increasing functions of the
difference between the critical fraction \(p_c\) and the actual fraction p.

Neyman-Pearson theory considers the problem of constructing a most powerful
test, at a given sample size and significance level, for distinguishing between
a below-critical fraction \(p_1\) of defectives and an above-critical fraction \(p_2\). The
choice of \(p_1\) and \(p_2\) has to remain somewhat arbitrary. A more serious
disadvantage of the Neyman-Pearson approach is its failure to relate the choice of
sample size and significance level to the economic background.

A satisfactory approach to this type of economic problem was introduced 
by A. Wald [8]. He considered the following risk function: 


risk = (loss due to wrong decision) × (probability of wrong decision)
+ (cost of test).

This risk is a function of the (unknown) fraction of defectives as well as of the 
test procedure. As far as the dependence on the test procedure is concerned it 
is apparent that some balance can be achieved between the contributions of 
the economic losses (1) and (2) by varying the acceptance level, while some 
balance can be achieved between the “expected loss” due to a wrong decision 
and the cost of the test by varying the sample size. 

The criterion adopted here for choosing the “optimum” test out of any family 
of test procedures is the minimax criterion: choose that test procedure for which
the risk function, maximized with respect to the unknown fraction p of de- 
fectives, shall be as small as possible. 





* Presented at the “Reliability of Complex Systems” session of the joint meeting of the American Statistical 
Association and American Association for the Advancement of Science in Berkeley, California on December 28, 1954. 
2. SPECIFIC CHOICE OF RISK FUNCTION


The cost of the test will be assumed to be proportional to the sample number. 
By a suitable choice of the unit of cost we may write: 


[Cost of test] = N, the sample number. 


The losses due to wrong decision will be assumed to be proportional to the
difference between the actual fraction p of defectives and the critical fraction \(p_c\):


[Loss due to accepting product whose \(p > p_c\)] = \(C(p - p_c)\);
[Loss due to rejecting product whose \(p < p_c\)] = \(C'(p_c - p)\);

where the constant C′ may be much less than the constant C if we suppose
that rejection of a good product is less serious than acceptance of a poor
product. The risk function is thus:

\[ r = \begin{cases} C(p - p_c)\Pr\{\text{accept}\} + \bar{N}, & \text{if } p > p_c, \\ C'(p_c - p)\Pr\{\text{reject}\} + \bar{N}, & \text{if } p < p_c, \end{cases} \]

where \(\bar{N}\) denotes the expected sample number in case, as in a sequential test,
the sample number is not fixed in advance. The general appearance of r as a
function of p is indicated by the full curve in Diagram A.





DIAGRAM A. Risk vs. Fraction of Defectives. [Diagram not reproduced; horizontal axis: p.]


In accordance with the minimax criterion, the test must be chosen to bring 
the high point (or points) in Diagram A as low as possible. 


3. THE OPTIMUM FIXED SIZE TEST FOR A VERY LOW
FRACTION OF DEFECTIVES¹


Suppose now that the articles to be tested are classifiable into just two
categories, satisfactory and defective. A fixed size test consists of testing N
articles and counting the number f of defectives among them. The product will
be rejected if f exceeds a certain integer F, and accepted otherwise. Because
of the general disparity between C and C′ we do not assume that \(F = Np_c\) (or
the nearest integer thereto). We do, however, assume that both F/N and \(p_c\) are
very small in comparison with unity. If the true fraction p is also small, the
probability of acceptance is given by the Poisson formula:





¹ The solution to this problem for the special case C′ = C has recently been given by S. Ura [6].
\[ \Pr\{f \le F\} = \sum_{f=0}^{F} \frac{(Np)^f}{f!}\,\exp[-Np]. \tag{3.1} \]

Making use of this formula, the optimum choice of test parameters F and N
(so as to minimize the maximum risk) has been obtained after much numerical
computation in terms of the quantities \(p_c\), C, C′. The optimum fixed sample
number is:

\[ N = \nu\sqrt{C}, \tag{3.2} \]

where \(\nu\) as well as F are shown in Fig. 1 as functions of the economic
parameters \(\rho\) and \(\delta\), defined by:

\[ \rho = C/C' \tag{3.3} \]

and

\[ \delta = p_c\sqrt{C}. \tag{3.4} \]


Fig. 1. Optimum fixed size test parameters (Poisson approximation). [Chart not reproduced; horizontal axis: \(\rho = C/C'\).]


The corresponding maximum risk is:

\[ r_{\max} = \omega\sqrt{C}, \tag{3.5} \]

where \(\omega\) is shown in Fig. 2 as a function of \(\rho\) and \(\delta\).

At the bottom right of Figs. 1 and 2 the economic situation calls for outright
rejection, the economic advantage to be gained from a very reliable product
being more than offset by the cost of the necessary test. The corresponding risk
function has the simple form of the dotted line in Diagram A. Over the
remainder of the “economic plane” defined by the economic parameters \(\rho\) and \(\delta\)
the risk function has the form of the full curve in Diagram A, with two maximum
points: one between p=0 and \(p=p_c\), at \(p=p_1\) say, and the other in the
range \(p>p_c\), at \(p=p_2\) say. The two maximum values for the risk are equal
except over a small region at the bottom left of Fig. 1 below the dotted line
shown, in which region the maximum at \(p_1\) is not quite as large as that at \(p_2\).
The locations \(p_1\) and \(p_2\) of these maximum points as functions of C, C′ and \(p_c\)
are not reproduced here, but, over the range \(1 \le \rho \le 100\), \(p_2\) turns out never to
exceed 2.7 \(p_c\). In the upper range of p values, moreover, the Poisson approximation
actually yields an upper bound to the risk. Since the Poisson approximation
does indeed lead to a risk function of the general shape shown in Diagram A, it
follows that, provided \(p_c\) is small, the use of the Poisson approximation in
conjunction with the minimax criterion is well justified.


Fig. 2. Maximum risk in optimum fixed size test (Poisson approximation). [Chart not reproduced; horizontal axis: \(\rho = C/C'\).]


If \(p_c\) is small but \(\delta\) (\(= p_c\sqrt{C}\)) is greater than 1,000, \(p_1\) and \(p_2\) are very close to
\(p_c\), while \(Np_c\) (\(= \nu\delta\)) is large enough so that the Poisson distribution of f, when
p is anywhere near \(p_1\) or \(p_2\), may be satisfactorily replaced by a normal
distribution with mean \(\nu\delta\) and variance \(\nu\delta\). This leads to a different form for the
solution, a form given previously in reference [1] and valid also when \(p_c\) is not
small, namely:

\[ N = \left(\frac{C+C'}{2}\right)^{2/3}(p_c q_c)^{1/3}\,\nu_{II}(\rho), \]

\[ F/N = p_c - \left(\frac{2\,p_c q_c}{C+C'}\right)^{1/3}\beta_{II}(\rho), \]

\[ r_{\max} = \left(\frac{C+C'}{2}\right)^{2/3}(p_c q_c)^{1/3}\,\omega_{II}(\rho), \]

where \(q_c = 1 - p_c\) and where the functions \(\nu_{II}(\rho)\), \(\beta_{II}(\rho)\) and \(\omega_{II}(\rho)\) are reproduced
here in Fig. 3.


4. THE OPTIMUM WALD-TYPE SEQUENTIAL TEST FOR A VERY 
LOW FRACTION OF DEFECTIVES 


Some reduction in maximum risk is achieved by optimizing a Wald-type 
sequential test [7] (see Diagram B) instead of a fixed size test. 


DIAGRAM B. The Sequential Test. [Diagram not reproduced; vertical axis: number of defectives, f; horizontal axis: number tested, N.]


Fig. 3. Optimum fixed size test parameters (Normal approximation). [Chart not reproduced; horizontal axis: \(\rho = C/C'\).]


The typical test is defined as follows:

Reject if and when \(f > mN + y_R\);
Accept if and when \(f \le mN - y_A\).


The determination of the optimum test parameters \(y_R\), \(y_A\), and m, in terms
of \(p_c\), C, C′, based on statistical formulas [2], [4] related to the Poisson
formula, has recently been completed over the range \(1 \le C/C' \le 100\) with the
aid of an IBM “701” computer. The results are:

\[ y_R = \alpha z, \tag{4.1} \]

\[ y_A = (1 - \alpha) z, \tag{4.2} \]

\[ m = p_c - C^{-1/2}\,\delta^{1/3}\,\beta, \tag{4.3} \]
where the test parameters z, \(\alpha\), \(\beta\) are shown in Figs. 4 and 5 as functions of the
economic parameters \(\rho\) and \(\delta\) defined by (3.3) and (3.4). The corresponding
maximum risk is, as in (3.5),

\[ r_{\max} = \omega\sqrt{C}, \tag{4.4} \]

where the new function \(\omega\) of \(\rho\), \(\delta\) is shown in Fig. 5. Again, as in the case of the
fixed size test, there is an economic region, at the bottom right of Figs. 4 and 5,
calling for outright rejection. There is here, moreover, an intermediate region,
corresponding to z<1, in which the optimum Wald-type test is a “limited
sequential test” definable as follows:


Test not more than N articles; 
Reject if and when a defective is obtained; 
Accept otherwise. 
Fig. 4. Optimum sequential test parameters z, \(\alpha\), \(\nu\) (Poisson approximation). [Chart not reproduced; horizontal axis: \(\rho = C/C'\).]

Fig. 5. Optimum sequential test parameters \(\beta\), \(\omega\) (Poisson approximation).


The optimum value of N is, as in (3.2),

\[ N = \nu\sqrt{C}, \tag{4.5} \]

where the new function \(\nu\) of \(\rho\), \(\delta\) is shown in Fig. 4.
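The two decision rules just described can be sketched in code. The functions below are a minimal illustration, not the author's computing procedure: each takes a sequence of test outcomes (True for a defective article) and applies the stated boundaries. The function names and the "undecided" return value, used when the supplied outcomes run out before a boundary is reached, are assumptions introduced for the example. The parameter values in the usage lines are the rounded ones of Example 1.

```python
from itertools import islice

def wald_type_test(outcomes, m, y_R, y_A):
    """Reject if and when f > m*N + y_R; accept if and when f <= m*N - y_A."""
    f = 0
    n_tested = 0
    for n_tested, defective in enumerate(outcomes, start=1):
        f += int(defective)
        if f > m * n_tested + y_R:
            return "reject", n_tested
        if f <= m * n_tested - y_A:
            return "accept", n_tested
    return "undecided", n_tested

def limited_sequential_test(outcomes, n_max):
    """Test not more than n_max articles; reject at the first defective."""
    n_tested = 0
    for n_tested, defective in enumerate(islice(outcomes, n_max), start=1):
        if defective:
            return "reject", n_tested
    return ("accept", n_tested) if n_tested == n_max else ("undecided", n_tested)

# Usage with m = 0.010, y_R = 1.9, y_A = 2.4 (rounded values from Example 1).
all_good = [False] * 1000
early_failures = [True, True] + [False] * 100

decision_good = wald_type_test(all_good, 0.010, 1.9, 2.4)
decision_bad = wald_type_test(early_failures, 0.010, 1.9, 2.4)
limited_reject = limited_sequential_test([False] * 5 + [True], 10)
limited_accept = limited_sequential_test([False] * 10, 10)

print(decision_good, decision_bad, limited_reject, limited_accept)
```

With no defectives at all, acceptance comes once mN reaches \(y_A\), that is, after about 240 articles for these values.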

Except in the case of outright rejection the risk function again has the general
appearance of the full curve in Diagram A, having, moreover, two equal
maximums. Again their locations \(p_1\) and \(p_2\) as functions of C, C′ and \(p_c\) are not
reproduced here, but, over the range \(1 \le \rho \le 100\), \(p_2\) turns out never to exceed
2.4 \(p_c\). Again, then, the Poisson approximation is justified when \(p_c\) is small.

As in the case of the fixed size test, when \(\delta\) is greater than 1,000 the Poisson
approximation may be replaced by the normal approximation, leading in this
case to the solution given on p. 63 of reference [1], namely:
\[ y_A = \left(\frac{C+C'}{2}\right)^{1/3}(p_c q_c)^{2/3}\,\eta_A(\rho), \]

\[ y_R = \left(\frac{C+C'}{2}\right)^{1/3}(p_c q_c)^{2/3}\,\eta_R(\rho), \]

\[ m = p_c - \left(\frac{2\,p_c q_c}{C+C'}\right)^{1/3}\beta_I(\rho), \]

\[ r_{\max} = \left(\frac{C+C'}{2}\right)^{2/3}(p_c q_c)^{1/3}\,\omega_I(\rho), \]

where the functions \(\eta_A(\rho)\), \(\eta_R(\rho)\), \(\beta_I(\rho)\) and \(\omega_I(\rho)\) are reproduced here in Fig. 6.
Comparison of Figs. 5 and 6 with Figs. 2 and 3 shows that the optimum Wald-type
sequential test leads to a lower maximum risk than the optimum fixed
size test. Accordingly, the region calling for outright rejection in Fig. 4 does not
extend as far as that in Fig. 1.

It is interesting that above a certain boundary line coinciding with the upper
boundary of the shaded region in Fig. 4 and extending to the left into the
limited test region, the computed Wald-type sequential test is optimum among
all possible test procedures. This optimum character of the Wald-type test
extends in particular to the region in which the normal approximation is valid.
It may also extend throughout the limited test region, if mixed strategies are
excluded, but this is only a conjecture. It does not extend, however, to the
region shaded in Fig. 4. In this latter region some improvement over the
indicated test may be obtained by resorting to curved acceptance and rejection
boundaries (Diagram B), with both boundaries curving away from the
acceptance region as N increases.


Fig. 6. Optimum sequential test parameters (Normal approximation). [Chart not reproduced; horizontal axis: \(\rho = C/C'\).]

An optimum mixed strategy may be computed at each point of the economic 
plane below the above-mentioned boundary line, this strategy being mixed in 
some ratio between outright rejection and a Wald-type test. This leads to a 
somewhat lower maximum expected risk than the pure strategy indicated in 
Figs. 4 and 5. 

Example 1. A commercial photographic concern expects to use 100,000 flash
bulbs in non-repeatable pictures during the coming year. Suppose that it is 
desired to test a new type of flash bulb for possible use during the year and 
that the reliability of the old type is known to be 99.0%. Suppose that each 
type costs 15¢, whereas the loss in costs and labor for every defective flash bulb 
averages $2.50. What is the appropriate test? 

If p is the fraction of defectives in the new type, the expected loss in costs
and labor is $250,000 × p. Thus, if p exceeds 0.010, but the new type is accepted,
the loss attributable to acceptance is $250,000 × (p − 0.010). Similarly, if p is
less than 0.010 but the new type is rejected, the loss attributable to rejection is
$250,000 × (0.010 − p). The cost of the test, apart from a possible initial sum, is
15¢ per bulb, assuming that each bulb is expended in the test. Adopting 15¢ as
the unit of cost we see that

\[ C = C' = \frac{250{,}000}{0.15} = 1.67 \times 10^{6} \]

and \(p_c = 0.010\).

Entering Figs. 1 and 2 with \(\delta = p_c\sqrt{C} = 12.9\) and \(\rho = C/C' = 1\), we read:
\(\nu\)=0.46, F=5, \(\omega\)=1.35. The appropriate fixed test size, from (3.2), is thus
N = 590, the maximum allowable number of defectives is 5, and the associated
risk, from (3.5), is \(r_{\max}\) = 1740 × $.15 = $260.
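The fixed size test of Example 1 can be checked against the risk function of Section 2. The sketch below evaluates the risk on a grid of p, using the Poisson formula for the probability of acceptance, and locates the largest risk on each side of the critical fraction. The grid and the function names are choices made for this illustration. The form \(\delta = p_c\sqrt{C}\) is taken here as the reading of (3.4) that reproduces the quoted value 12.9.

```python
import math

def poisson_cdf(F, mean):
    """Pr{f <= F} for a Poisson variable with the given mean."""
    term = math.exp(-mean)
    total = term
    for f in range(1, F + 1):
        term *= mean / f
        total += term
    return total

def risk(p, N, F, C, C_prime, p_c):
    """Risk of the fixed size test (accept when f <= F), in units of the cost per article."""
    accept = poisson_cdf(F, N * p)
    if p > p_c:
        return C * (p - p_c) * accept + N
    return C_prime * (p_c - p) * (1.0 - accept) + N

C = C_prime = 250_000 / 0.15      # loss constant in units of 15 cents
p_c = 0.010
delta = p_c * math.sqrt(C)
N, F = 590, 5                     # test read from Figs. 1 and 2 in the text

grid = [i * 1e-5 for i in range(1, 5001)]   # p from 0.00001 to 0.05
max_low = max(risk(p, N, F, C, C_prime, p_c) for p in grid if p < p_c)
max_high = max(risk(p, N, F, C, C_prime, p_c) for p in grid if p > p_c)

print(f"delta = {delta:.1f}")
print(f"maximum risk below p_c: {max_low:.0f} units")
print(f"maximum risk above p_c: {max_high:.0f} units")
```

The two maxima come out close to each other and to the quoted 1740 units; exact agreement is not to be expected, since N and the chart readings are rounded.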

From Figs. 4 and 5 we obtain the appropriate sequential test. Corresponding
to \(\delta\)=12.9 and \(\rho\)=1 we read: z=4.3, \(\alpha\)=0.45, \(\beta\)=−0.02 and \(\omega\)=1.15. Hence,
from (4.1), (4.2) and (4.3), the appropriate test parameters are \(y_R\)=1.9, \(y_A\)=2.4
and m=0.010+, while the associated risk, from (4.4), is \(r_{\max}\) = 1480 × $.15
= $220, which is indeed somewhat lower than in the fixed size test.
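The arithmetic that turns chart readings into sequential test parameters can be sketched as follows. The forms used for (4.3) and (4.4), namely \(m = p_c - C^{-1/2}\delta^{1/3}\beta\) and \(r_{\max} = \omega\sqrt{C}\), are readings of the printed formulas, adopted here because they reproduce the figures quoted in Examples 1 and 2; they should be treated as assumptions. The chart readings are those given in the text.

```python
import math

def sequential_parameters(C, p_c, z, alpha, beta, omega):
    """Convert chart readings (z, alpha, beta, omega) into y_R, y_A, m and r_max."""
    root_C = math.sqrt(C)
    delta = p_c * root_C
    y_R = alpha * z
    y_A = (1.0 - alpha) * z
    m = p_c - beta * delta ** (1.0 / 3.0) / root_C   # assumed form of (4.3)
    r_max = omega * root_C                           # assumed form of (4.4)
    return y_R, y_A, m, r_max

C = 250_000 / 0.15   # units of 15 cents
p_c = 0.010

# Example 1 (rho = 1) and Example 2 (rho = 4), with the readings quoted in the text.
y_R1, y_A1, m1, r1 = sequential_parameters(C, p_c, z=4.3, alpha=0.45, beta=-0.02, omega=1.15)
y_R2, y_A2, m2, r2 = sequential_parameters(C, p_c, z=3.1, alpha=0.29, beta=0.39, omega=0.67)

print(f"Example 1: y_R={y_R1:.2f}, y_A={y_A1:.2f}, m={m1:.4f}, risk=${0.15 * r1:.0f}")
print(f"Example 2: y_R={y_R2:.2f}, y_A={y_A2:.2f}, m={m2:.4f}, risk=${0.15 * r2:.0f}")
```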

Example 2. Suppose in Example 1 that either the new type will be accepted
and a year’s supply ordered or the new type will be rejected and 3 months’
supply of the old type ordered with a re-evaluation of a modified new type
anticipated at the end of 3 months. In this case the main loss attributable to
rejection of a new type whose fraction of defectives is below critical is $62,500
× (0.010 − p) instead of $250,000 × (0.010 − p). \(\rho\), then, is 4 instead of 1 while
\(\delta\) remains equal to 12.9.
From Figs. 1 and 2 we read: \(\nu\)=0.30, F=2, \(\omega\)=0.80. The appropriate fixed
test size is thus N = 390, the maximum allowable number of defectives is 2, and
the associated risk is \(r_{\max}\) = 1030 × $.15 = $155. Note that 3 defectives out of
390 would lead to rejection in spite of the critical fraction 0.010; the test is
biased because of the disparity here between the losses due to the two possible
wrong decisions.

From Figs. 4 and 5 we obtain the appropriate sequential test. Corresponding
to \(\delta\)=12.9 and \(\rho\)=4 we read: z=3.1, \(\alpha\)=0.29, \(\beta\)=0.39, \(\omega\)=0.67. Hence the
appropriate test parameters are \(y_R\)=0.9, \(y_A\)=2.2, m=0.0093, while the associated
risk is \(r_{\max}\) = 865 × $.15 = $130, which is again somewhat lower than in the
fixed size test. Note how the test is biased by having not only \(m < p_c\) but also
\(y_R < y_A\).

From Figs. 1 and 4 we see that a substantially greater disparity between the 
two possible losses, e.g., \(\rho\)=80, would call for rejection without a test, the
economic advantage of a highly reliable bulb being offset by the cost of the 
necessary test, even when the initial cost of the test is neglected, as it has been 
in the optimizations presented in this paper. 


5. OPTIMUM TESTS FOR MEASURABLE ARTICLES 


Suppose now that the articles to be tested are classifiable according to some 
physical measurement x whose underlying distribution is supposed to be nor- 
mal. The fraction of defectives is assumed to be that fraction of the normal 
population which lies above a known critical limit. 

Denoting the critical limit by \(x_c\) and the (unknown) mean and standard
deviation² of the normal population by \(\mu\) and \(\sigma\) respectively, the fraction of
defectives is:

\[ p = \Phi\left(\frac{\mu - x_c}{\sigma}\right), \tag{5.1} \]

where \(\Phi(t)\) denotes the cumulative normal

\[ \Phi(t) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{t} e^{-a^2/2}\,da. \]


If, on the basis of a sample of size N from this normal population, we form
the usual estimates:

\[ \hat{\mu} = \frac{1}{N}\sum_{i=1}^{N} x_i \tag{5.2} \]

and

\[ \hat{\sigma} = \sqrt{\frac{1}{N-1}\sum_{i=1}^{N}(x_i - \hat{\mu})^2}, \tag{5.3} \]

the estimated fraction of defectives is:

\[ \hat{p} = \Phi\left(\frac{\hat{\mu} - x_c}{\hat{\sigma}}\right). \tag{5.4} \]


² The solution of a similar problem in which the standard deviation is assumed known has recently been given
for the special case C′ = C by Moriguti [5].


The test is to be followed by rejection if \(\hat{p}\) exceeds a certain fraction \(p_R\) and by
acceptance otherwise. Again, because of the general disparity between C′ and
C, we do not assume that \(p_R\) is equal to the critical fraction \(p_c\).
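The estimate and the decision rule can be sketched as follows. This is a minimal illustration: the function names are invented for the example, the divisor N−1 in the standard deviation is an assumption about what "the usual estimates" means, and the sample values and the rejection fraction used below are made up.

```python
import math
import statistics

def norm_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))

def estimated_fraction_defective(measurements, x_c):
    """Estimate the fraction of the normal population lying above the critical limit x_c."""
    mean = statistics.fmean(measurements)
    std = statistics.stdev(measurements)   # divisor N-1 (an assumption)
    return norm_cdf((mean - x_c) / std)

def fixed_size_decision(measurements, x_c, p_R):
    """Reject if the estimated fraction exceeds p_R; accept otherwise."""
    p_hat = estimated_fraction_defective(measurements, x_c)
    return ("reject" if p_hat > p_R else "accept"), p_hat

# Made-up sample with mean 10 and standard deviation 1; critical limit 12.
sample = [9.0, 10.0, 11.0]
decision, p_hat = fixed_size_decision(sample, x_c=12.0, p_R=0.010)

print(decision, round(p_hat, 5))
```

A sample of three is used only to make the arithmetic easy to follow; the normal theory of the next paragraph applies to large N.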

Now, for large sample numbers N, not only is \(\hat{\mu}\) normally distributed about
the true mean \(\mu\) with variance \(\sigma^2/N\), but \(\hat{\sigma}\) is approximately normally
distributed about the true standard deviation \(\sigma\) with variance \(\sigma^2/2N\). After
expansion of \(\hat{p}\) in a Taylor series about \(\hat{\mu}=\mu\) and \(\hat{\sigma}=\sigma\) it follows easily [3] that \(\hat{p}\)
is approximately normally distributed about the true fraction p with variance

\[ \frac{1}{N}\left(1 + \frac{u^2}{2}\right)\Phi'^2(u), \]

where \(u = \Phi^{-1}(1-p)\), i.e., \(1-p = \Phi(u)\). In the neighborhood of the maximum
risk u may be replaced by \(u_c\), i.e., by \(\Phi^{-1}(1-p_c)\). Using these facts, the optimum
fixed size test parameters N and \(p_R\) and the maximum risk \(r_{\max}\) are given by:


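\[ N = \left(\frac{C+C'}{2}\right)^{2/3}\left[\left(1 + \frac{u_c^2}{2}\right)\Phi'^2(u_c)\right]^{1/3}\nu_{II}(\rho), \]

\[ p_R = p_c - \left[\frac{(2 + u_c^2)\,\Phi'^2(u_c)}{C+C'}\right]^{1/3}\beta_{II}(\rho), \]

\[ r_{\max} = \left(\frac{C+C'}{2}\right)^{2/3}\left[\left(1 + \frac{u_c^2}{2}\right)\Phi'^2(u_c)\right]^{1/3}\omega_{II}(\rho), \]

where

\[ \Phi'(u_c) = \frac{1}{\sqrt{2\pi}}\,e^{-(1/2)u_c^2} \]

and \(u_c\) is given by:

\[ \Phi(u_c) = 1 - p_c, \tag{5.6} \]

and where \(\nu_{II}(\rho)\), \(\beta_{II}(\rho)\) and \(\omega_{II}(\rho)\) are the same functions shown in Fig. 3.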

A corresponding sequential test is available if \(\hat{p}\) is calculated at each stage N
and the product \(N\hat{p}\), \(= f_N\) say, plotted in place of f (Diagram B) against N.
The optimum test parameters and the maximum risk are given by:

\[ y_R = \left(\frac{C+C'}{2}\right)^{1/3}\left[\left(1 + \frac{u_c^2}{2}\right)\Phi'^2(u_c)\right]^{2/3}\eta_R(\rho), \]

\[ y_A = \left(\frac{C+C'}{2}\right)^{1/3}\left[\left(1 + \frac{u_c^2}{2}\right)\Phi'^2(u_c)\right]^{2/3}\eta_A(\rho), \]

\[ m = p_c - \left[\frac{(2 + u_c^2)\,\Phi'^2(u_c)}{C+C'}\right]^{1/3}\beta_I(\rho), \]

\[ r_{\max} = \left(\frac{C+C'}{2}\right)^{2/3}\left[\left(1 + \frac{u_c^2}{2}\right)\Phi'^2(u_c)\right]^{1/3}\omega_I(\rho), \]

where \(\eta_R(\rho)\), \(\eta_A(\rho)\), \(\beta_I(\rho)\) and \(\omega_I(\rho)\) are the same functions shown in Fig. 6. In
using this sequential test some lower bound should be set on N since the
optimization is based on the normality of the sampling distribution of \(f_N\), valid
only if N is large. A suggested lower bound is one-quarter of the optimum fixed
test size:

\[ N \ge \frac{1}{4}\left(\frac{C+C'}{2}\right)^{2/3}\left[\left(1 + \frac{u_c^2}{2}\right)\Phi'^2(u_c)\right]^{1/3}\nu_{II}(\rho), \tag{5.8} \]

since, as indicated in reference [1], the average sample number for the optimum
sequential test, when the normal approximation is applicable, is only about
20% less than the optimum fixed test size, and consequently the lower boundary
(5.8) may be expected not seriously to interfere with the asymptotically
optimum character of this sequential test.

Example 3. Suppose in Example 1 that it is believed that any defects of
the new type of flash bulb can only be due to insufficiently low pressure inside
the bulb. Suppose that there is available (at negligible cost!) an apparatus for
measuring the pressure in each bulb, the bulb being expended in the process.
Suppose finally that the rather dubious assumption is made that the internal
pressures are distributed normally, the mean and standard deviation of this
normal distribution being unknown, while the critical pressure, above which
the bulb will not function, is known. This means that definite statistical
information is obtainable on the fraction of pressures above critical not only from
the numbers of bulb pressures above and below critical but also from the
spacing of these pressures.

Here the \(x_i\)’s are the internal pressures while \(x_c\) is the critical pressure. Using
C = C′ = 1.67 × 10⁶ and \(p_c\)=0.010, so that \(\rho\)=1 and \(u_c = \Phi^{-1}(0.99) = 2.33\), we read
from Fig. 3: \(\nu_{II}\)=0.193, \(\beta_{II}\)=0, \(\omega_{II}\)=0.58. Hence the appropriate fixed test size
is N = 370 with \(p_R\)=0.010, while the associated risk is \(r_{\max}\) = 1120 × $.15 = $170.
It is interesting to note what reduction in risk over that in Example 1 is
associated with the additional statistical information implied by the assumption
of a particular type of distribution of internal pressures.

From Fig. 6 we obtain the appropriate sequential test. Reading \(\eta_R = \eta_A\)
= 0.43, \(\beta_I\)=0, \(\omega_I\)=0.49, we obtain the test parameters \(y_R = y_A = 0.96\), m=0.010,
and the associated risk \(r_{\max}\) = 950 × $.15 = $140. In accordance with (5.8) the
sequential calculation of \(f_N\) should start at N = 93.
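The figures of Example 3 can be reproduced numerically. The sketch below assumes a particular reading of the formulas of this section. N and the maximum risk are taken to scale as \(((C+C')/2)^{2/3} g^{1/3}\), and \(y_R\), \(y_A\) as \(((C+C')/2)^{1/3} g^{2/3}\), where \(g = (1 + u_c^2/2)\Phi'^2(u_c)\). These exponents are an assumption, adopted because they reproduce the quoted results. The chart readings are those given in the text.

```python
import math

def norm_pdf(u):
    """Standard normal density, i.e. the derivative of the cumulative normal."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)

C = C_prime = 250_000 / 0.15   # units of 15 cents
u_c = 2.33                     # normal deviate with upper tail area 0.010

half_sum = (C + C_prime) / 2.0
g = (1.0 + u_c ** 2 / 2.0) * norm_pdf(u_c) ** 2

scale_N = half_sum ** (2.0 / 3.0) * g ** (1.0 / 3.0)   # scale for N and r_max
scale_y = half_sum ** (1.0 / 3.0) * g ** (2.0 / 3.0)   # scale for y_R and y_A

# Chart readings quoted in Example 3 for rho = 1.
N_fixed = scale_N * 0.193      # nu_II
r_fixed = scale_N * 0.58       # omega_II
y_seq = scale_y * 0.43         # eta_R = eta_A
r_seq = scale_N * 0.49         # omega_I
N_start = N_fixed / 4.0        # lower bound (5.8)

print(f"fixed test size N = {N_fixed:.0f}, risk = {r_fixed:.0f} units")
print(f"sequential y_R = y_A = {y_seq:.2f}, risk = {r_seq:.0f} units")
print(f"sequential calculation starts at N = {N_start:.0f}")
```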
ANALYSIS OF SENSITIVITY EXPERIMENTS WHEN THE
LEVELS OF STIMULUS CANNOT BE CONTROLLED


ABRAHAM GOLUB AND FRANK E. GRUBBS
Ballistic Research Laboratories

For sensitivity experiments, a method employing the customary probit
theory is given for estimating the mean and standard deviation of
an assumed cumulative normal distribution for the case where the levels
of stimulus cannot be controlled precisely.


INTRODUCTION 


EXPERIMENTAL investigations frequently deal with continuous variables 
which cannot be measured as such in practice. For example, in testing the 
sensitivity of an explosive to a stimulus, say shock, the usual procedure is to 
drop a specified weight onto samples of the explosive from various heights. 
Obviously there are heights at which a sample will detonate and others at 
which it will not. It can be assumed, however, that all samples which did not 
detonate at one height would have detonated had the weight been dropped 
from a sufficiently great height. Thus, each sample can be said to have a 
“critical height” at or above which it will detonate and below which it will fail 
to detonate. Therefore, the “critical height” is a continuous variable which 
cannot be measured precisely, since all one can do is apply the stimulus (drop 
the weight) from a given height and record whether the particular specimens 
under test did or did not detonate. 

Such experimental situations exist in many other fields of scientific endeavor, 
notably in biological, pharmaceutical and psychological research. The treat- 
ment of such data has been dealt with in considerable detail by C. I. Bliss [1], 
C. West Churchman [2], the Statistical Research Group, Columbia University 
[4], and the “up and down” technique has been studied by the Statistical 
Research Group, Princeton University [3]. All these treatments, however, as 
far as the authors are aware, apparently deal with situations in which the levels 
of stimulus or test can be preassigned accurately or controlled precisely. There 
is a class of sensitivity experiments for which the levels of stimulus can neither 
be assigned precisely in advance nor controlled. That is to say, we may aim for 
a level of stimulus but will not hit this level precisely because of random or 
uncontrollable variations in the stimuli levels themselves. In other words, we 
have the problem of estimating the dosage response curve when the conditions 
of the probit model all obtain, but the observations cannot be taken in groups 
at fixed “dose” levels as is usual in many applied problems. A typical situation 
of this type is that of determining the velocity at which an armor piercing 
projectile will penetrate a given thickness of armor plate. For this case, a 
velocity of, for example, 2000 f/s may be aimed for (by adjusting the weight of 
the propellant), but because velocities for a fixed charge are randomly 
distributed (the standard deviation in observed velocities for a fixed charge 
may be, for example, 15 f/s), the velocity actually obtained may be, say, 
2015 f/s instead of the desired 2000 f/s. Thus, it is not possible to control 
velocity such that, for example, 5 projectiles could all be fired at precisely 
2000 f/s. In such a situation the velocity for which practically no 
penetrations occur may be, for example, 1960 f/s and the velocity above which 
practically 100% penetration occurs may be 2100 f/s, i.e., the zone of “mixed” 
results extends over a velocity range of 200 f/s. Thus one has to vary the 
propellant charge weight to obtain velocities over the range of 200 f/s, but 
in doing so one also encounters, for each weight of charge, an uncontrollable 
random error, since the actual observed velocities for a fixed charge are 
subject to a standard error of, say, 15 f/s. Moreover, in many practical 
problems, the available sample size is so small (because of costs, etc.) that 
only one observation for each of several levels of stimulus may result from an 
experiment. This paper is therefore concerned with the analysis of sensitivity 
data for such classes of problems, which appear to deserve some special 
emphasis. Although the method is primarily for the rather special, small sample 
case, it does possess some generality since it may be used irrespective of 
whether there are one or more observations per level. 

In the analysis below, the critical levels of stimulus are assumed to be 
normally distributed and the method of maximum likelihood is employed to 
obtain estimates of the mean critical level and the standard deviation of the 
critical levels. Also, asymptotic approximations for the precision of these 
estimates are developed. 


METHOD FOR ESTIMATING THE MEAN CRITICAL LEVEL AND STANDARD 
DEVIATION IN CRITICAL LEVEL 


As a result of test we have an observed set of generally distinct levels of 
stimulus, $l_i$ (perhaps only one observation per level), and a statement for 
each level that the response was either a “success” or a “failure.” If the true 
probabilities of a successful response at the resulting levels are $p_i$, the 
probability of the observed set may be written in the usual form 

$$P = \prod_i p_i^{\delta_i} q_i^{1-\delta_i} \tag{1}$$

where $\delta_i = 0$ or 1 for the different levels of test, $l_i$, depending upon 
whether an unsuccessful or successful response was observed. It is to be noted 
that we depart from the customary treatment where there are, say, $n_i$ successes 
and $f_i$ failures in $n_i + f_i$ trials at a preassigned level, $l_i$. In our case, 
we almost always wind up with only one observation per level and hence at each 
resulting level we have only a success or a failure. Here, $\delta_i$ is a random 
variable which takes on the value 0 or the value 1, as observed, but whose 
expected or mean value is $E(\delta_i) = p_i$. 

As noted above, it is assumed that the critical stimuli are normally 
distributed. Now 

$$p_i = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{t_i} e^{-t^2/2}\,dt = 1 - q_i, \quad \text{where } t_i = \frac{l_i - \mu}{\sigma},$$

and $\mu$ and $\sigma$ are the unknown mean critical stimulus and standard 
deviation in critical stimuli, respectively, which we wish to estimate. 
To estimate the parameters $\mu$ and $\sigma$, employing R. A. Fisher's method of 
maximum likelihood, it is customary to maximize the logarithm, L, of the 
likelihood function P, i.e., $L = \log P$, or 

$$L = \sum_i \{\delta_i \log p_i + (1 - \delta_i) \log q_i\}. \tag{2}$$
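
The log-likelihood (2) is easy to evaluate directly. The following is a minimal Python sketch, not part of the original paper; the function names `normal_cdf` and `log_likelihood` are illustrative, and the data are the five rounds of the worked example given later in the paper (1 denotes a penetration).

```python
import math

def normal_cdf(t):
    """Cumulative standardized normal distribution p(t)."""
    return 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))

def log_likelihood(levels, responses, mu, sigma):
    """L = sum of delta*log(p) + (1 - delta)*log(q) over the observed levels."""
    total = 0.0
    for level, delta in zip(levels, responses):
        p = normal_cdf((level - mu) / sigma)
        # delta = 1 contributes log p, delta = 0 contributes log q = log(1 - p)
        total += math.log(p) if delta == 1 else math.log(1.0 - p)
    return total

# Data of the worked example (velocities in f/s; 1 = penetration).
velocities = [2433, 2415, 2415, 2453, 2423]
penetrated = [0, 0, 0, 1, 1]

# Log-likelihood at the solution quoted in the example.
print(log_likelihood(velocities, penetrated, 2431.6, 15.0))
```

Comparing the value of L at several trial pairs $(\mu, \sigma)$ gives a quick check that an iterative solution has moved uphill.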
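In order to maximize L we equate to zero the partial derivatives of L with 
respect to $\mu$ and $\sigma$. The solution of these equations yields the desired 
$\hat\mu$ and $\hat\sigma$. The caret is used to denote an estimate of a population 
parameter. Since 

$$\frac{\partial p_i}{\partial \mu} = -\frac{z_i}{\sigma} = -\frac{\partial q_i}{\partial \mu}; \quad \frac{\partial p_i}{\partial \sigma} = -\frac{t_i z_i}{\sigma} = -\frac{\partial q_i}{\partial \sigma} \quad \text{and} \quad z_i = \frac{1}{\sqrt{2\pi}}\, e^{-t_i^2/2},$$

the maximum likelihood equations may then be written as 

$$\frac{\partial L}{\partial \mu} = \frac{1}{\sigma} \sum_i \left\{ (1 - \delta_i) \frac{z_i}{q_i} - \delta_i \frac{z_i}{p_i} \right\} = 0 \tag{3}$$

$$\frac{\partial L}{\partial \sigma} = \frac{1}{\sigma} \sum_i \left\{ (1 - \delta_i) \frac{t_i z_i}{q_i} - \delta_i \frac{t_i z_i}{p_i} \right\} = 0 \tag{4}$$
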

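For a given set of observed stimuli, however, we may substitute the actual 
$\delta_i$ (i.e., 0 or 1) and thus equations (1), (2), (3) and (4) may conveniently 
be rearranged and rewritten for computational purposes as follows: 

$$P = \prod_{r=1}^{n} p_r \prod_{s=1}^{m} q_s \tag{1a}$$

where 

n = number of successful responses and 
m = number of unsuccessful responses 

$$L = \sum_{r=1}^{n} \log p_r + \sum_{s=1}^{m} \log q_s \tag{2a}$$

$$\frac{\partial L}{\partial \mu} = \frac{1}{\sigma} \left[ \sum_s \frac{z_s}{q_s} - \sum_r \frac{z_r}{p_r} \right] = 0 \tag{3a}$$

$$\frac{\partial L}{\partial \sigma} = \frac{1}{\sigma} \left[ \sum_s \frac{t_s z_s}{q_s} - \sum_r \frac{t_r z_r}{p_r} \right] = 0. \tag{4a}$$
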


In order to solve (3a) and (4a) for $\hat\mu$ and $\hat\sigma$ we will employ the 
familiar Newton-Raphson procedure. The procedure is particularly effective 
provided that close first estimates, $\mu_0$ and $\sigma_0$, can be found. In this 
connection, if a considerable number of stimuli are involved then the use of 
normal probability paper is expeditious in obtaining first estimates. If the 
stimuli are few in number a graphical first estimate may be obtained by 
plotting solutions of (3a) for varying $\sigma$ and solutions of (4a) for varying 
$\mu$. The intersection of these two curves generally yields a good first 
estimate. It should be pointed out, however, that in some instances the 
Newton-Raphson method may not lead to a solution, as in the case where the 
solution is in a neighborhood such that the determinant of the Jacobian is 
small, approaching zero. In such instances some other iteration procedure may 
be used, e.g., the method of false position. 

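The Newton-Raphson procedure involves a set of simultaneous equations for 
use in the iteration process. These are 

$$-\left(\frac{\partial L}{\partial \mu}\right)_0 = \left(\frac{\partial^2 L}{\partial \mu^2}\right)_0 \Delta\mu + \left(\frac{\partial^2 L}{\partial \mu\,\partial \sigma}\right)_0 \Delta\sigma \tag{5}$$

$$-\left(\frac{\partial L}{\partial \sigma}\right)_0 = \left(\frac{\partial^2 L}{\partial \mu\,\partial \sigma}\right)_0 \Delta\mu + \left(\frac{\partial^2 L}{\partial \sigma^2}\right)_0 \Delta\sigma \tag{6}$$

where $\Delta\mu = \mu - \mu_0$ and $\Delta\sigma = \sigma - \sigma_0$ are the increments to be 
added to the initial estimates $(\mu_0, \sigma_0)$, etc. To employ (5) and (6) it is 
necessary to develop the second partial derivatives 

$$\frac{\partial^2 L}{\partial \mu^2}, \quad \frac{\partial^2 L}{\partial \sigma^2}, \quad \text{and} \quad \frac{\partial^2 L}{\partial \mu\,\partial \sigma}.$$

These are, from (3a) and (4a), 

$$\frac{\partial^2 L}{\partial \mu^2} = \frac{1}{\sigma^2} \left[ \sum_s \left( \frac{t_s z_s}{q_s} - \frac{z_s^2}{q_s^2} \right) - \sum_r \left( \frac{t_r z_r}{p_r} + \frac{z_r^2}{p_r^2} \right) \right] \tag{7}$$

$$\frac{\partial^2 L}{\partial \mu\,\partial \sigma} = \frac{1}{\sigma^2} \left[ \sum_r \frac{z_r}{p_r} - \sum_s \frac{z_s}{q_s} + \sum_s \left( \frac{t_s^2 z_s}{q_s} - \frac{t_s z_s^2}{q_s^2} \right) - \sum_r \left( \frac{t_r^2 z_r}{p_r} + \frac{t_r z_r^2}{p_r^2} \right) \right] \tag{8}$$

$$\frac{\partial^2 L}{\partial \sigma^2} = \frac{1}{\sigma^2} \left[ 2\sum_r \frac{t_r z_r}{p_r} - 2\sum_s \frac{t_s z_s}{q_s} + \sum_s \left( \frac{t_s^3 z_s}{q_s} - \frac{t_s^2 z_s^2}{q_s^2} \right) - \sum_r \left( \frac{t_r^3 z_r}{p_r} + \frac{t_r^2 z_r^2}{p_r^2} \right) \right] \tag{9}$$
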

The Newton-Raphson procedure is illustrated in the example below. 
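
The iteration can also be sketched in a few lines of Python. This is an illustrative implementation only, not the authors' own computing procedure: the first and second derivatives are obtained by differentiating the log-likelihood of the normal model directly, the function names are hypothetical, and the starting point $\mu_0 = 2435$, $\sigma_0 = 17$ is an assumed value close to the first estimate of the example.

```python
import math

def pdf(t):
    """Standardized normal density z(t)."""
    return math.exp(-0.5 * t * t) / math.sqrt(2.0 * math.pi)

def cdf(t):
    """Cumulative standardized normal distribution p(t)."""
    return 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))

def derivatives(levels, responses, mu, sigma):
    """Return dL/dmu, dL/dsigma, d2L/dmu2, d2L/dmu dsigma, d2L/dsigma2."""
    g_mu = g_sigma = h_mm = h_ms = h_ss = 0.0
    for level, delta in zip(levels, responses):
        t = (level - mu) / sigma
        z = pdf(t)
        if delta == 1:                      # success: terms in z/p
            r = z / cdf(t)
            g_mu -= r
            g_sigma -= t * r
            h_mm -= t * r + r * r
            h_ms -= t * t * r + t * r * r
            h_ss -= t ** 3 * r + t * t * r * r
        else:                               # failure: terms in z/q
            r = z / (1.0 - cdf(t))
            g_mu += r
            g_sigma += t * r
            h_mm += t * r - r * r
            h_ms += t * t * r - t * r * r
            h_ss += t ** 3 * r - t * t * r * r
    # g_mu and g_sigma are still sigma times the first derivatives here.
    h_ms -= g_mu
    h_ss -= 2.0 * g_sigma
    s2 = sigma * sigma
    return g_mu / sigma, g_sigma / sigma, h_mm / s2, h_ms / s2, h_ss / s2

def newton_raphson(levels, responses, mu, sigma, tol=1e-8, max_iter=50):
    """Repeat the linear step of equations (5) and (6) until the increments vanish."""
    for _ in range(max_iter):
        g1, g2, a, b, c = derivatives(levels, responses, mu, sigma)
        det = a * c - b * b                 # determinant of the matrix of second derivatives
        d_mu = (-g1 * c + g2 * b) / det
        d_sigma = (-a * g2 + b * g1) / det
        mu, sigma = mu + d_mu, sigma + d_sigma
        if abs(d_mu) < tol and abs(d_sigma) < tol:
            break
    return mu, sigma

velocities = [2433, 2415, 2415, 2453, 2423]
penetrated = [0, 0, 0, 1, 1]
print(newton_raphson(velocities, penetrated, 2435.0, 17.0))
```

No safeguard is included for the case noted above in which the determinant approaches zero; a practical routine would test `det` and fall back on another iteration scheme.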


LARGE SAMPLE VARIANCE OF $\hat\mu$ AND $\hat\sigma$ 


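For large sample size n the variances of $\hat\mu$ and $\hat\sigma$ may be obtained in 
accordance with existing maximum likelihood theory from the expected values of 
the second partial derivatives of the likelihood L. Considering the general 
equations (3) and (4) and the fact that $E(\delta_i) = p_i$ and 
$E(1 - \delta_i) = q_i$ we obtain 

$$E\left(\frac{\partial L}{\partial \mu}\right) = \frac{1}{\sigma} \sum_i \{z_i - z_i\} = 0 \tag{10}$$

$$E\left(\frac{\partial L}{\partial \sigma}\right) = \frac{1}{\sigma} \sum_i \{t_i z_i - t_i z_i\} = 0 \tag{11}$$

$$E\left(\frac{\partial^2 L}{\partial \mu^2}\right) = -\frac{1}{\sigma^2} \sum_i \left( \frac{z_i^2}{q_i} + \frac{z_i^2}{p_i} \right) \tag{12}$$

$$E\left(\frac{\partial^2 L}{\partial \mu\,\partial \sigma}\right) = -\frac{1}{\sigma^2} \sum_i \left( \frac{t_i z_i^2}{q_i} + \frac{t_i z_i^2}{p_i} \right) \tag{13}$$

$$E\left(\frac{\partial^2 L}{\partial \sigma^2}\right) = -\frac{1}{\sigma^2} \sum_i \left( \frac{t_i^2 z_i^2}{q_i} + \frac{t_i^2 z_i^2}{p_i} \right). \tag{14}$$
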

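Assuming that 

$$-E\left(\frac{\partial^2 L}{\partial \mu^2}\right) = E\left(\frac{\partial L}{\partial \mu}\right)^2 = A_{\mu\mu}, \quad -E\left(\frac{\partial^2 L}{\partial \mu\,\partial \sigma}\right) = E\left(\frac{\partial L}{\partial \mu}\right)\left(\frac{\partial L}{\partial \sigma}\right) = A_{\mu\sigma}, \quad -E\left(\frac{\partial^2 L}{\partial \sigma^2}\right) = E\left(\frac{\partial L}{\partial \sigma}\right)^2 = A_{\sigma\sigma},$$

then the variance-covariance matrix may be obtained as 

$$\begin{pmatrix} A_{\mu\mu} & A_{\mu\sigma} \\ A_{\mu\sigma} & A_{\sigma\sigma} \end{pmatrix}^{-1} = \begin{pmatrix} A^{\mu\mu} & A^{\mu\sigma} \\ A^{\mu\sigma} & A^{\sigma\sigma} \end{pmatrix}$$

in which $A^{\mu\mu}$ is the asymptotic variance of $\hat\mu$, $A^{\sigma\sigma}$ is the 
asymptotic variance of $\hat\sigma$ and $A^{\mu\sigma}$ is the asymptotic covariance of 
$\hat\mu$ and $\hat\sigma$. 

TABLES I and II are provided so as to facilitate the computational work 
necessary in employing expressions (4) through (9). 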
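
As a rough illustration of (12) through (14), the elements $A_{\mu\mu}$, $A_{\mu\sigma}$ and $A_{\sigma\sigma}$ can be accumulated level by level, using $z_i^2/q_i + z_i^2/p_i = z_i^2/(p_i q_i)$. The Python sketch below is an assumption-laden aid rather than the authors' tabular method; `information_matrix` is a hypothetical name, and the levels are those of the example that follows.

```python
import math

def pdf(t):
    """Standardized normal density z(t)."""
    return math.exp(-0.5 * t * t) / math.sqrt(2.0 * math.pi)

def cdf(t):
    """Cumulative standardized normal distribution p(t)."""
    return 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))

def information_matrix(levels, mu, sigma):
    """Return (A_mumu, A_musigma, A_sigmasigma), the negatives of (12), (13), (14)."""
    a_mm = a_ms = a_ss = 0.0
    for level in levels:
        t = (level - mu) / sigma
        p = cdf(t)
        w = pdf(t) ** 2 / (p * (1.0 - p))   # z^2/q + z^2/p
        a_mm += w
        a_ms += t * w
        a_ss += t * t * w
    s2 = sigma * sigma
    return a_mm / s2, a_ms / s2, a_ss / s2

velocities = [2433, 2415, 2415, 2453, 2423]
print(information_matrix(velocities, 2431.6, 15.0))
```

The three numbers printed can be compared with the values quoted in the example; small differences are to be expected from rounding in tabular work.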


EXAMPLE 


In firing five rounds of a given projectile at a given armor plate the following 
observations were recorded: 


Velocity (f/s) Condition of Impact 
2433 Non-Penetration 
2415 Non-Penetration 
2415 Non-Penetration 
2453 Penetration 
2423 Penetration 
TABLE I 

RATIO OF THE STANDARDIZED NORMAL DENSITY FUNCTION TO 
THE CUMULATIVE STANDARDIZED NORMAL DISTRIBUTION, z(t)/p(t)* 

   t      z(t)/p(t) 
  0.04     0.773 
  0.14     0.711 
  0.24     0.652 
  0.34     0.595 
  0.44     0.540 
  0.54     0.489 
  0.64     0.440 
  0.74     0.394 
  0.84     0.351 
  0.94     0.310 
  1.04     0.273 
  1.14     0.239 
  1.24     0.207 
  1.34     0.179 
  1.44     0.153 
  1.54     0.130 

* For negative values of t, p and q are interchanged. 
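As a first estimate we use $\mu_0 = 2435$, $\sigma_0 = 17$. In order to employ the 
iteration procedure we set up the following tabulation (utilizing the appended 
tables): 

Velocity   Result             t_i      t_i^2     t_i^3    z/q or z/p   z^2/q^2 or z^2/p^2 
2433       Non-Penetration   -0.118    0.0139   -0.0016      ...          0.5242 
2415       Non-Penetration   -1.176    1.3830   -1.6264      ...          0.0515 
2415       Non-Penetration   -1.176    1.3830   -1.6264      ...          0.0515 
2453       Penetration        1.059    1.1215    1.1877      ...          ... 
2423       Penetration       -0.706    0.4984   -0.3519      ...          ... 

From this tabulation we obtain 

$$\sigma_0\left(\frac{\partial L}{\partial \mu}\right)_0 = -.3830, \quad \sigma_0\left(\frac{\partial L}{\partial \sigma}\right)_0 = .0132, \quad \sigma_0\left(\frac{\partial^2 L}{\partial \mu^2}\right)_0 = -.1389,$$

$$\sigma_0\left(\frac{\partial^2 L}{\partial \mu\,\partial \sigma}\right)_0 = .0805 \quad \text{and} \quad \sigma_0\left(\frac{\partial^2 L}{\partial \sigma^2}\right)_0 = -.1161.$$

Employing the Newton-Raphson method of iteration, we obtain our second 
estimate of the parameters by solving for $\Delta\mu$ and $\Delta\sigma$ in the 
following set of equations. 

$$.3830 = -.1389\,\Delta\mu + .0805\,\Delta\sigma$$
$$-.0132 = .0805\,\Delta\mu - .1161\,\Delta\sigma$$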


Solving for $\Delta\mu$ and $\Delta\sigma$ we get $\Delta\mu = -4.5$ and 
$\Delta\sigma = -3.0$. Our second estimate thus becomes $\mu_1 = 2430.5$ and 
$\sigma_1 = 14$. Repeating the above procedure twice more we finally obtain the 
solution $\hat\mu = 2431.6$ f/s and $\hat\sigma = 15.0$ f/s. 
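
The pair of linear equations of this first step can be checked with a short Python sketch. The helper `solve_2x2` is a hypothetical name introduced for illustration; it applies Cramer's rule to the coefficients printed above.

```python
def solve_2x2(a11, a12, a21, a22, b1, b2):
    """Solve a11*x + a12*y = b1, a21*x + a22*y = b2 by Cramer's rule."""
    det = a11 * a22 - a12 * a21
    x = (b1 * a22 - a12 * b2) / det
    y = (a11 * b2 - a21 * b1) / det
    return x, y

# Coefficients and left-hand sides of the two equations of the example.
d_mu, d_sigma = solve_2x2(-0.1389, 0.0805, 0.0805, -0.1161, 0.3830, -0.0132)
print(round(d_mu, 1), round(d_sigma, 1))
```

Adding the increment in $\sigma$ to the first estimate 17 gives the second estimate 14 quoted in the text.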

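Employing the solution obtained, we obtain the approximate asymptotic 
variances of $\hat\mu$ and $\hat\sigma$ by computing the following: 

$$E\left(\frac{\partial^2 L}{\partial \mu^2}\right) = -.01011, \quad E\left(\frac{\partial^2 L}{\partial \mu\,\partial \sigma}\right) = .00330, \quad E\left(\frac{\partial^2 L}{\partial \sigma^2}\right) = -.00751,$$

so that 

$$A_{\mu\mu} = .01011, \quad A_{\mu\sigma} = -.00330, \quad A_{\sigma\sigma} = .00751.$$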
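The variance-covariance matrix is given by 

$$\begin{pmatrix} .01011 & -.00330 \\ -.00330 & .00751 \end{pmatrix}^{-1} = \begin{pmatrix} 115.5 & 50.8 \\ 50.8 & 155.5 \end{pmatrix}$$

and so $\sigma_{\hat\mu}^2 = 115.5$ or $\sigma_{\hat\mu} = 10.7$ f/s and 
$\sigma_{\hat\sigma}^2 = 155.5$ or $\sigma_{\hat\sigma} = 12.5$ f/s. 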
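
The inversion is simple enough to verify by hand or with the following Python sketch (an illustrative check only; `invert_symmetric_2x2` is a hypothetical helper name).

```python
import math

def invert_symmetric_2x2(a, b, c):
    """Invert the symmetric matrix [[a, b], [b, c]]; return (var_1, cov, var_2)."""
    det = a * c - b * b
    return c / det, -b / det, a / det

var_mu, cov, var_sigma = invert_symmetric_2x2(0.01011, -0.00330, 0.00751)
print(var_mu, cov, var_sigma)                  # asymptotic variances and covariance
print(math.sqrt(var_mu), math.sqrt(var_sigma)) # corresponding standard errors, f/s
```

The positive covariance reflects the fact that, for these data, an overestimate of the mean critical velocity tends to accompany an overestimate of the standard deviation.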


SUMMARY 


For the class of sensitivity experiments in which the levels of stimulus can 
neither be preassigned nor controlled, it is possible to determine, by the 
method of maximum likelihood, estimates of the parameters $\mu$ and $\sigma$ of the 
assumed underlying distribution. Also, asymptotic variances of these estimates 
can be obtained. Tables are appended hereto to aid in estimating parameters for 
the special case considered here. It is important, however, that in the 
execution of a sensitivity experiment care be taken to insure a “zone of mixed 
results,” i.e., a range of stimuli which will yield an overlap of successes and 
failures. In the absence of such a “zone” the method of maximum likelihood will 
not yield a unique solution. Probably the best way to obtain a good “zone of 
mixed results” would be the use of an “up and down” [3] method of test, if 
possible. 

Recently, due to frequent requests for analyses of sensitivity data of the 
type mentioned here, the Computing Laboratory, Ballistic Research Laboratories, 
Aberdeen Proving Ground, Maryland, has coded the solutions developed in 
this paper for use with electronic computers. As a result, solutions for many 
sets of observed data are obtained and recorded in a matter of a few 
minutes. 
A NOTE ON UNIFORMLY BEST UNBIASED ESTIMATORS 
FOR VARIANCE COMPONENTS 


FRANKLIN A. GRAYBILL AND A. W. WORTHAM 
Oklahoma Agricultural and Mechanical College 


THE purpose of this note is to state a theorem (Theorem I below) concerning 
uniformly best unbiased estimators of variance components. 
Consider the linear model given by 

$$y_{n_1 n_2 \cdots n_p} = \mu + A^{(1)}_{n_1} + A^{(2)}_{n_2} + \cdots + e_{n_1 n_2 \cdots n_p} \tag{1}$$

where $y_{n_1 n_2 \cdots n_p}$ is the observation, $\mu$ is a fixed unknown constant 
generally called the over-all mean, the $A^{(1)}_{n_1}, A^{(2)}_{n_2}, \cdots, e_{n_1 n_2 \cdots n_p}$ 
are independent normal variables whose means are zero and whose variances are 
$\sigma_1^2, \sigma_2^2, \cdots, \sigma_t^2$ respectively ($\sigma_j^2$ distinct). This model is 
commonly called a component of variance model and sometimes referred to as the 
Eisenhart Model II [4]. For example, if the model is a two-way balanced 
classification it can be written as 

$$y_{ij} = \mu + a_i + b_j + e_{ij} \qquad i = 1, 2, \cdots, m_1;\ j = 1, 2, \cdots, m_2 \tag{2}$$

where the $a_i$, $b_j$, and $e_{ij}$ are independent normal variables whose means are 
zero and whose variances are $\sigma_1^2$, $\sigma_2^2$, and $\sigma_3^2$ respectively, and 
corresponding to the notation in (1) it follows that $a_i = A^{(1)}_{n_1}$ and 
$b_j = A^{(2)}_{n_2}$. 

In models such as the ones above, it is desired to estimate the $\sigma_j^2$ or 
linear combinations of the $\sigma_j^2$. This can be done by partitioning the 
corrected total sum of squares $\sum (y_{n_1 n_2 \cdots n_p} - \bar y_{\cdots})^2$ into t 
non-negative quadratic forms $Q_1, Q_2, \cdots, Q_t$ such that (where 
$\bar y_{\cdots}$ is the mean of $y_{n_1 n_2 \cdots n_p}$) 

$$\sum (y_{n_1 n_2 \cdots n_p} - \bar y_{\cdots})^2 = \sum_{j=1}^{t} Q_j. \tag{3}$$

If $f_i$ represents the degrees of freedom associated with the sum of squares 
$Q_i$, and if $L_i$ represents the expected value of $Q_i/f_i$, then $L_i$ is a 
linear function of the $\sigma_j^2$ and the partitioning, which is called the 
analysis of variance, can be put into a form as illustrated in TABLE I. 


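TABLE I 

Source of Variation                        Sum of Squares                                        Degrees of Freedom   Expected Mean Square 
Total                                      $\sum (y_{n_1 n_2 \cdots n_p} - \bar y_{\cdots})^2$   $f_0$ 
due to $A^{(1)}$                           $Q_1$                                                 $f_1$                $L_1$ 
due to $A^{(2)}$                           $Q_2$                                                 $f_2$                $L_2$ 
...                                        ...                                                   ...                  ... 
due to error ($e_{n_1 n_2 \cdots n_p}$)    $Q_t$                                                 $f_t$                $L_t$ 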
The $L_i$ are linear functions of the $\sigma_j^2$ only (they do not involve $\mu$). 
The equations 

$$Q_i/f_i = L_i \qquad i = 1, 2, \cdots, t \tag{4}$$

are solved for the $\sigma_j^2$ $(j = 1, 2, \cdots, t)$ and these solutions are used as 
the estimates. This procedure is known as the analysis of variance method of 
estimating variance components and has been discussed by many authors [2], [3], 
[5], [6]. For the special case of the two-way classification model defined by 
(2), the partition is indicated in TABLE II. 
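TABLE II 

Source of Variation          Sum of Squares                                                                 Degrees of Freedom    Expected Mean Square 
Total                        $\sum_{ij} (y_{ij} - \bar y_{..})^2$                                           $m_1 m_2 - 1$ 
due to $A^{(1)}$ or $a_i$    $Q_1 = \sum_{ij} (\bar y_{i.} - \bar y_{..})^2$                                $m_1 - 1$             $\sigma_3^2 + m_2\sigma_1^2 = L_1$ 
due to $A^{(2)}$ or $b_j$    $Q_2 = \sum_{ij} (\bar y_{.j} - \bar y_{..})^2$                                $m_2 - 1$             $\sigma_3^2 + m_1\sigma_2^2 = L_2$ 
due to error                 $Q_3 = \sum_{ij} (y_{ij} - \bar y_{i.} - \bar y_{.j} + \bar y_{..})^2$         $(m_1 - 1)(m_2 - 1)$  $\sigma_3^2 = L_3$ 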


Corresponding to (4) the equations 

$$\frac{Q_1}{m_1 - 1} = \sigma_3^2 + m_2\sigma_1^2, \quad \frac{Q_2}{m_2 - 1} = \sigma_3^2 + m_1\sigma_2^2, \quad \text{and} \quad \frac{Q_3}{(m_1 - 1)(m_2 - 1)} = \sigma_3^2$$

are solved for $\sigma_1^2$, $\sigma_2^2$, and $\sigma_3^2$ and the solutions are taken as the 
estimates. (In the sum of squares, if a dot replaces any subscript in $y_{ij}$ it 
indicates the mean of the $y_{ij}$ when summed over the subscript which is 
replaced by a dot.) 
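
For the two-way case the computation may be sketched as follows in Python. This is an illustration under stated assumptions, not a routine from the note: the function name `variance_components` is hypothetical, one observation per cell is assumed, and $Q_3$ is taken to be the usual residual sum of squares.

```python
def variance_components(table):
    """Analysis of variance estimates for a balanced two-way classification.

    table[i][j] holds y_ij (one observation per cell). Returns the solutions
    of the three equations for sigma_1^2, sigma_2^2 and sigma_3^2.
    """
    m1, m2 = len(table), len(table[0])
    grand = sum(sum(row) for row in table) / (m1 * m2)
    row_means = [sum(row) / m2 for row in table]
    col_means = [sum(table[i][j] for i in range(m1)) / m1 for j in range(m2)]

    q1 = m2 * sum((r - grand) ** 2 for r in row_means)
    q2 = m1 * sum((c - grand) ** 2 for c in col_means)
    q3 = sum((table[i][j] - row_means[i] - col_means[j] + grand) ** 2
             for i in range(m1) for j in range(m2))

    ms1 = q1 / (m1 - 1)
    ms2 = q2 / (m2 - 1)
    ms3 = q3 / ((m1 - 1) * (m2 - 1))
    # Solve Q_i / f_i = L_i for the variance components.
    return (ms1 - ms3) / m2, (ms2 - ms3) / m1, ms3

# A small, purely additive (hypothetical) table: the error component is zero.
print(variance_components([[1.0, 2.0], [3.0, 4.0]]))
```

Note that the solutions for $\sigma_1^2$ and $\sigma_2^2$ are differences of mean squares and so can come out negative in a particular sample, although they are unbiased.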

It is well known that the analysis of variance method of estimating variance 
components as given in TABLE I for model (1) (and for the special case of model 
(2) in TABLE II) produces unbiased estimators of the $\sigma_j^2$ and unbiased 
estimators for any linear function of the $\sigma_j^2$. However, almost nothing has 
been said about how “good” these estimators are. In many important cases a 
strong statement regarding “goodness” of variance components estimators can be 
made. 
The following theorem, called the complete class theorem for sufficient 
estimators, is well known: Let $x_1, x_2, \cdots, x_n$ be random variables with 
joint density 

$$f(x_1, x_2, \cdots, x_n; \theta_1, \theta_2, \cdots, \theta_k)$$

where $\theta_i$ are unknown parameters. Let $t_i(x_1, x_2, \cdots, x_n) = t_i$ 
$(i = 1, 2, \cdots, k)$ be a set of jointly sufficient statistics for the 
$\theta_i$ and suppose the $t_i$ are complete. Suppose further that it is desired to 
estimate some function of the $\theta_i$, say $h(\theta_1, \theta_2, \cdots, \theta_k)$, for 
which an unbiased estimator, say $g(x_1, x_2, \cdots, x_n)$, exists. Then there 
exists a unique function of the $t_i$ which is an unbiased estimator of 
$h(\theta_1, \theta_2, \cdots, \theta_k)$ and this estimator has uniformly the smallest 
variance of any unbiased estimator [1], [7], [8]. 
By applying this theorem the following theorem can be established: 
Theorem I. If the subscripts $n_1, n_2, \cdots, n_p$ in the analysis of variance 
model (1) are such that the quantity 

$$\sum (y_{n_1 n_2 \cdots n_p} - \bar y_{\cdots})^2$$

can be partitioned into non-negative quadratic forms $Q_1, Q_2, \cdots, Q_t$ as 
indicated in TABLE I such that 

(a) $Q_i/L_i$ $(i = 1, 2, \cdots, t)$ are distributed independently as chi-square 
with $f_i$ degrees of freedom respectively, 
(b) The $L_i$ are linearly independent linear functions of the $\sigma_j^2$, so that 
the equations $Q_i/f_i = L_i$ $(i = 1, 2, \cdots, t)$ have unique solutions for 
the $\sigma_j^2$ $(j = 1, 2, \cdots, t)$, 

then the uniformly best (minimum variance) unbiased estimator of any linear 
function of the $L_i$ is given by the same linear function of the $Q_i/f_i$. That 
is to say, if the conditions of the theorem are true, then the analysis of 
variance method of estimating the variance components, $\sigma_j^2$ or any linear 
functions of the $\sigma_j^2$, is unbiased and no other method of estimating 
variance components gives unbiased estimators which have smaller variance. Thus 
we say the estimators given by this method are uniformly best unbiased 
estimators. 

The proof of the theorem is outlined as follows: Since the $Q_i$ are a sufficient 
set of statistics for the $L_i$, and since the $Q_i$ satisfy the completeness 
property [7], and since condition (b) insures that unbiased estimators exist 
for $\sigma_j^2$ and for any linear function of the $\sigma_j^2$, this theorem is just a 
simple application of the complete class theorem for sufficient estimators. 
However, this statement concerning the “goodness” of variance component 
estimates has not appeared in the literature, and since variance components are 
becoming increasingly important in many fields of experimentation, it seems 
that this theorem should be printed. 

The analysis of variance models for which the conditions of this theorem 
are satisfied are all the so-called “balanced complete” models when the 
components are independent normal variables. These include randomized complete 
block, Latin square, the factorial models, the nested classification, split-plot 
models, and many other components of variance models. 
SOME ESTIMATORS IN SAMPLING WITH VARYING 
PROBABILITIES WITHOUT REPLACEMENT 


DES RAJ 
Indian Statistical Institute 


The problem considered is estimation of the total value of a character 
for a finite population from a sample when the units are selected with 
varying probabilities without replacement. Several unbiased estimators 
are proposed. Exact expressions and unbiased estimators for the vari- 
ances of the estimators are obtained. Certain properties of Yates and 
Grundy’s estimator of variance are proved. Examples are included 
to study the relative performance of the various estimators. The re- 
sults obtained for unistage sampling in the first part of the paper are 
extended to multistage designs in the second part. 


PART I 


1. INTRODUCTION 


IT is well known that by assigning varying probabilities of selection to 
different units in a population, it is possible to reduce considerably the 
sampling error of the estimates over those obtained when sampling with equal 
probabilities. Recently, Horvitz and Thompson [4] have proposed an unbiased 
estimator for estimating the total of a finite population and have also 
estimated the variance of their estimator when sampling is carried out without 
replacement with varying probabilities at each draw. Their estimator is 

$$\hat y_{HT} = \sum_{i=1}^{n} \frac{y_i}{\pi_i} \tag{1}$$

where $\pi_i$ is the probability that the ith unit in the population enters the 
sample of size n. Also 

$$V(\hat y_{HT}) = \sum_{i=1}^{N} \frac{1 - \pi_i}{\pi_i}\, y_i^2 + 2 \sum_{j>i=1}^{N} \frac{\pi_{ij} - \pi_i \pi_j}{\pi_i \pi_j}\, y_i y_j \tag{2}$$

and an unbiased estimate of the variance is 

$$\hat V(\hat y_{HT}) = \sum_{i=1}^{n} \frac{1 - \pi_i}{\pi_i^2}\, y_i^2 + 2 \sum_{j>i=1}^{n} \frac{\pi_{ij} - \pi_i \pi_j}{\pi_i \pi_j \pi_{ij}}\, y_i y_j \tag{3}$$


where $\pi_{ij}$ is the probability that the ith and jth units in the population 
enter the sample. One serious disadvantage with the estimator (3) is that it 
may assume negative values. For example, when a sample of size 2 is taken from 
a population of four units given by 

$$y_1 = 1, \quad y_2 = 2, \quad y_3 = 3, \quad y_4 = 4$$

such that $\pi_{12} = \pi_{34} = 3/8$, $\pi_{13} = \pi_{14} = \pi_{23} = \pi_{24} = 1/16$, it is 
easy to see that the estimator (3) would be negative for all samples except for 
$(y_1, y_2)$ and $(y_3, y_4)$. In such cases the estimate (1) becomes useless since 
its variance cannot be estimated from the sample. Yates and Grundy [6] have 
proposed an alternative estimator for (2) which is believed to be less often 
negative. They recast (2) into 


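$$V(\hat y_{HT}) = \sum_{j>i=1}^{N} (\pi_i \pi_j - \pi_{ij}) \left( \frac{y_i}{\pi_i} - \frac{y_j}{\pi_j} \right)^2 \tag{4}$$

so that their estimator is 

$$\hat V_{YG}(\hat y_{HT}) = \sum_{j>i=1}^{n} \frac{\pi_i \pi_j - \pi_{ij}}{\pi_{ij}} \left( \frac{y_i}{\pi_i} - \frac{y_j}{\pi_j} \right)^2. \tag{5}$$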

They remark that “although it is not immediately apparent that 
$(\pi_i \pi_j - \pi_{ij})/\pi_{ij}$ is necessarily positive, this appears to be the case 
when the usual method of selection is employed.” It is easy to see that the 
estimator (5) can be negative. For instance, in the example considered earlier 
(5) is negative whenever (3) is positive. We thus see that the estimators of 
variance given by Horvitz and Thompson [4] or by Yates and Grundy [6] can take 
negative values. This disquieting situation led the present author to search 
for estimators whose estimated variance should be always positive. Such 
estimates and others have been presented in this paper. In the scheme 
considered by previous authors there is only one unbiased estimator and that 
too can be negative. We have given a whole class of estimates whose estimated 
variance is always positive. It is not claimed that this estimator is 
necessarily more efficient than the estimates presented earlier, although this 
has been found to be the case in several examples. One such example is that 
given by Yates and Grundy themselves. 
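
The behaviour of the two variance estimators in the four-unit example can be verified by enumerating the six possible samples. The Python sketch below is illustrative only; it takes $\pi_{12} = \pi_{34} = 3/8$ (the value required for the six pair probabilities to sum to one, given that the other four are 1/16), and the function names are hypothetical.

```python
from fractions import Fraction
from itertools import combinations

y = {1: 1, 2: 2, 3: 3, 4: 4}

# Joint inclusion probabilities pi_ij for samples of size 2.
pi_pair = {pair: Fraction(1, 16) for pair in combinations(y, 2)}
pi_pair[(1, 2)] = pi_pair[(3, 4)] = Fraction(3, 8)

# pi_i is the sum of pi_ij over all pairs containing unit i.
pi = {i: sum(p for pair, p in pi_pair.items() if i in pair) for i in y}

def ht_variance_estimate(i, j):
    """Estimator (3) of Horvitz and Thompson for the sample (i, j)."""
    pij = pi_pair[(i, j)]
    single = sum((1 - pi[k]) / pi[k] ** 2 * y[k] ** 2 for k in (i, j))
    cross = 2 * (pij - pi[i] * pi[j]) / (pi[i] * pi[j] * pij) * y[i] * y[j]
    return single + cross

def yg_variance_estimate(i, j):
    """Estimator (5) of Yates and Grundy for the sample (i, j)."""
    pij = pi_pair[(i, j)]
    return (pi[i] * pi[j] - pij) / pij * (y[i] / pi[i] - y[j] / pi[j]) ** 2

for pair in pi_pair:
    print(pair, ht_variance_estimate(*pair), yg_variance_estimate(*pair))
```

Exact rational arithmetic is used so that the signs of the estimates are not affected by rounding.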

2. PROBABILITIES OF SELECTION AND 
THEOREMS ON EXPECTATIONS 


Let there be a population 

$$U_1, U_2, \cdots, U_N \tag{6}$$

from which a sample of size n is drawn without replacement with varying 
probabilities of selection at each draw, the scheme of selection of a unit at a 
particular draw depending naturally on the units already drawn in the sample 
but not on the order in which they were drawn. For the first draw the 
probabilities of selection are: 

$$\{p_{i1}\}, \quad (i = 1, 2, \cdots, N), \quad p_{i1} > 0, \quad \sum p_{i1} = 1. \tag{7}$$

At the second draw we have $\binom{N}{1}$ sets of probabilities of selection 

$$\{p_{i2}^{1}\}, \{p_{i2}^{2}\}, \cdots, \{p_{i2}^{N}\} \tag{8}$$

according as the first, second, ..., Nth units are drawn at the first draw. And 
so on for other draws. At the nth draw we have $\binom{N}{n-1}$ sets of 
probabilities of selection depending on which of the n-1 units have been drawn 
at the previous draws. Let the sample values obtained in order according to 
this general scheme be 

$$\begin{pmatrix} y_1 & y_2 & \cdots & y_n \\ p_{i1} & p_{i2} & \cdots & p_{in} \end{pmatrix} \tag{9}$$
where the superscripts of $p_{ij}$ have been omitted. Before proceeding further 
we will present two theorems which will considerably simplify our derivations. 
Let a sample of size n=2 be drawn with varying probabilities of selection at 
each draw from a population of size N. Let $f(y_i, y_j)$ be a function of the 
sample values drawn. By $E_2(f)$ we shall denote the conditional expected value 
of f for a given value of the first draw and similarly $V_2(f)$ denotes the 
conditional variance of f. In the same way $E_1(f)$ and $V_1(f)$ denote expected 
values and variances for all possible units obtained at the first draw. Then it 
is easy to prove the following: 

Theorem 1. The expected value of f is equal to the expected value of the 
conditional expected values, i.e., 

$$E(f) = E_{12}(f) = E_1 E_2(f). \tag{10}$$

Theorem 2. The variance of f is equal to the sum of the expected value of 
the conditional variances and the variance of the conditional expected values, 
i.e., 

$$V(f) = V_{12}(f) = (E_1 V_2 + V_1 E_2)(f). \tag{11}$$

The two theorems are capable of an easy generalization for any sample size n. 
In fact 

$$E_{12 \cdots n}(f) = E_1 E_2 \cdots E_n(f), \tag{12}$$

$$V_{12 \cdots n}(f) = (E_1 V_{23 \cdots n} + V_1 E_{23 \cdots n})(f). \tag{13}$$

3. ONE SET OF ESTIMATES 


In order to estimate the total Y of a character y for the population (6) for 
the sampling scheme given in Section 2, one set of proposed estimates is 

$$t_1 = \frac{y_1}{p_{i1}}, \quad t_2 = y_1 + \frac{y_2}{p_{i2}}, \quad \cdots, \quad t_n = y_1 + y_2 + \cdots + y_{n-1} + \frac{y_n}{p_{in}}. \tag{14}$$

Using theorem 1, we have $E(t_n) = E_{12 \cdots n}(t_n) = E_1 E_2 \cdots E_n(t_n)$. Since 
$E_n(y_n/p_{in}) = Y - (y_1 + y_2 + \cdots + y_{n-1})$, we have $E(t_n) = Y$, so that 
$t_1, t_2, \cdots, t_n$ are all unbiased for estimating Y. Hence any linear function 

$$t = \sum_{i=1}^{n} c_i t_i, \quad \sum_{i=1}^{n} c_i = 1 \tag{15}$$


is an unbiased estimate. To obtain the variance of $t_n$ we use theorem 2 and have 

$$V(t_n) = \sum p_{i1} \sum p_{i2} \cdots \sum p_{i,n-1} \sum \frac{y_n^2}{p_{in}} - \sum p_{i1} \sum p_{i2} \cdots \sum p_{i,n-1} (Y - y_1 - y_2 - \cdots - y_{n-1})^2, \quad n > 1 \tag{16}$$

where summation is always over the available units. 
An application of theorem 2 gives the very pleasing result 

$$C(t_\lambda, t_\mu) = 0, \quad (\lambda \neq \mu) \tag{17}$$

where $C(t_\lambda, t_\mu)$ denotes the covariance between $t_\lambda$ and $t_\mu$. 

The result (17) shows that the estimates $t_1, t_2, \cdots, t_n$ are uncorrelated. 
The variance of t, the general linear unbiased estimate in this set-up, can 
then be easily obtained as 

$$V(t) = \sum_{i=1}^{n} c_i^2 V(t_i). \tag{18}$$
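
The unbiasedness of the estimates and the zero covariance (17) can be checked by exact enumeration for n = 2. In the Python sketch below the four-unit population, the first-draw probabilities, and the choice of selecting the second unit with probability proportionate to the first-draw probabilities of the remaining units are all assumptions made for illustration; any scheme of the kind described in Section 2 could be substituted.

```python
from fractions import Fraction
from itertools import permutations

y = [1, 2, 3, 4]                               # hypothetical population values
p = [Fraction(k, 10) for k in (1, 2, 3, 4)]    # hypothetical first-draw probabilities

def ordered_samples():
    """Yield (probability, t1, t2) for every ordered sample of size 2."""
    for i, j in permutations(range(len(y)), 2):
        p2 = p[j] / (1 - p[i])                 # conditional probability at the second draw
        t1 = y[i] / p[i]
        t2 = y[i] + y[j] / p2
        yield p[i] * p2, t1, t2

def expectation(f):
    """Exact expected value of f(t1, t2) over all ordered samples."""
    return sum(prob * f(t1, t2) for prob, t1, t2 in ordered_samples())

e1 = expectation(lambda t1, t2: t1)
e2 = expectation(lambda t1, t2: t2)
cov = expectation(lambda t1, t2: t1 * t2) - e1 * e2
print(e1, e2, cov)
```

Both expectations equal the population total and the covariance vanishes exactly, whatever positive probabilities are chosen.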
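In particular for the very practical situation n=2, we have 

$$t = c_1 \frac{y_1}{p_{i1}} + c_2 \left( y_1 + \frac{y_2}{p_{i2}} \right), \tag{19}$$

$$V(t) = c_1^2 \left( \sum \frac{y_1^2}{p_{i1}} - Y^2 \right) + c_2^2 \left( \sum p_{i1} \sum \frac{y_2^2}{p_{i2}} - \sum p_{i1} (Y - y_1)^2 \right). \tag{20}$$
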

For any choice of $c_1, c_2, \cdots, c_n$ such that their sum is unity we get an 
unbiased estimate of Y given by (15). It will be seen that the estimate for 
which the coefficients are all equal has a very desirable property. If one 
seeks for an estimate which should reduce to N times the sample mean when the 
selection probabilities are equal, it is easy to see that the coefficients 
$c_1, c_2, \cdots, c_n$ should satisfy the following equations: 

$$\begin{aligned} N c_1 + c_2 + \cdots + c_n &= N/n, \\ (N - 1) c_2 + \cdots + c_n &= N/n, \\ &\ \ \vdots \\ (N - n + 1) c_n &= N/n. \end{aligned} \tag{21}$$

We now come to the problem of estimating the variance of the estimator (15). 
Making use of the result (17) that $t_1, t_2, \cdots, t_n$ are uncorrelated, we have 

$$E(t_\lambda t_\mu) = Y^2 \quad \text{for } \lambda \neq \mu. \tag{22}$$

Hence an unbiased estimate of the variance of t given by (15) is 

$$\hat V(t) = t^2 - \frac{2 \sum' t_\lambda t_\mu}{n(n - 1)} \tag{23}$$

where $\sum'$ denotes summation over the $\binom{n}{2}$ pairs. In case 
$c_1 = c_2 = \cdots = c_n = 1/n$, so that the estimator is 

$$t_{mean} = \frac{1}{n} \sum_{i=1}^{n} t_i, \tag{24}$$

we have from (23) that 

$$\hat V(t_{mean}) = \frac{\left( \sum t_i \right)^2}{n^2} - \frac{2 \sum' t_\lambda t_\mu}{n(n - 1)} = \frac{1}{n(n - 1)} \sum_{i=1}^{n} (t_i - t_{mean})^2 \tag{25}$$

which is evidently always positive. In view of the fact that the estimators so 
far presented can be negative, the estimator $t_{mean}$ seems to be a very 
interesting estimate. 
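
For a single observed sample, the estimate $t_{mean}$ and its estimated variance can be computed as in the following Python sketch. The function names are hypothetical, and `sample_p` is assumed to hold, for each drawn unit in order, the conditional probability with which that unit was selected at its draw.

```python
from fractions import Fraction

def t_values(sample_y, sample_p):
    """t_k = y_1 + ... + y_(k-1) + y_k / p_k for k = 1, ..., n."""
    ts, running = [], 0
    for yk, pk in zip(sample_y, sample_p):
        ts.append(running + yk / pk)
        running += yk
    return ts

def mean_estimate_and_variance(ts):
    """Return t_mean and the variance estimate sum (t_i - t_mean)^2 / (n(n-1))."""
    n = len(ts)
    t_mean = sum(ts) / n
    var_hat = sum((t - t_mean) ** 2 for t in ts) / (n * (n - 1))
    return t_mean, var_hat

# Hypothetical sample of size 2: values 3 and 1, drawn with probabilities 3/10 and 1/7.
ts = t_values([3, 1], [Fraction(3, 10), Fraction(1, 7)])
print(ts, mean_estimate_and_variance(ts))
```

Because the variance estimate is a sum of squares it cannot be negative; it is zero only when all the $t_i$ coincide, as happens in this small illustration.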
4. A SECOND SET OF ESTIMATES 
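We shall now give another set of estimates as 

$$t_1' = \frac{y_1}{p_{i1}}, \quad t_2' = \frac{1}{N - 1} \cdot \frac{1}{p_{i1}} \cdot \frac{y_2}{p_{i2}}, \quad \cdots, \quad t_n' = \frac{1}{(N - 1)(N - 2) \cdots (N - n + 1)} \cdot \frac{1}{p_{i1}} \cdot \frac{1}{p_{i2}} \cdots \frac{y_n}{p_{in}}. \tag{26}$$

Using theorem 1 we have 

$$E(t_1') = E(t_2') = \cdots = E(t_n') = Y$$

so that $t_1', t_2', \cdots, t_n'$ are unbiased for estimating Y. Hence 

$$t' = \sum_{i=1}^{n} c_i t_i', \quad \sum_{i=1}^{n} c_i = 1 \tag{27}$$

is an unbiased estimate. For the variance of $t_n'$, an application of theorem 2 
gives 

$$V(t_n') = \frac{1}{[(N - 1)(N - 2) \cdots (N - n + 1)]^2} \sum \frac{1}{p_{i1}} \sum \frac{1}{p_{i2}} \cdots \sum \frac{y_n^2}{p_{in}} - Y^2, \quad n > 1 \tag{28}$$

while the covariance between $t_\lambda'$ and $t_\mu'$ $(\mu > \lambda)$ is given by 

$$C(t_\lambda', t_\mu') = \frac{1}{(N - \lambda)(N - \lambda + 1)^2 \cdots (N - 1)^2} \sum \frac{1}{p_{i1}} \sum \frac{1}{p_{i2}} \cdots \sum \frac{y_\lambda}{p_{i\lambda}} (Y - y_1 - y_2 - \cdots - y_\lambda) - Y^2 \tag{29}$$

where $\sum$ always denotes summation over the available units. Using (28) and 
(29) the variance of t' can be easily obtained. In particular for n=2 we have 

$$t' = c_1 \frac{y_1}{p_{i1}} + c_2 \frac{1}{N - 1} \cdot \frac{1}{p_{i1}} \cdot \frac{y_2}{p_{i2}}, \tag{30}$$

$$V(t') = c_1^2 \left( \sum \frac{y_1^2}{p_{i1}} - Y^2 \right) + c_2^2 \left( \frac{1}{(N - 1)^2} \sum \frac{1}{p_{i1}} \sum \frac{y_2^2}{p_{i2}} - Y^2 \right) + 2 c_1 c_2 \left( \frac{1}{N - 1} \sum \frac{y_1}{p_{i1}} (Y - y_1) - Y^2 \right). \tag{31}$$

It is of interest to note that t' reduces to N times the sample mean if 

$$c_1 = c_2 = \cdots = c_n = \frac{1}{n}. \tag{32}$$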


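We now turn to the question of obtaining an unbiased estimate of the 
variance of t'. We notice that 

$$E(y_i t_i') = \sum_{k=1}^{N} Y_k^2 \quad (i = 1, 2, \cdots, n), \tag{33}$$

$$E(y_i t_j') = \frac{2}{N - 1} \sum_{l>k} Y_k Y_l \quad (j > i = 1, 2, \cdots, n). \tag{34}$$

Hence, if 

$$G = \frac{1}{n} \sum_{i=1}^{n} y_i t_i' + \frac{2(N - 1)}{n(n - 1)} \sum_{j>i} y_i t_j', \tag{35}$$

we have 

$$E(G) = Y^2. \tag{36}$$

Thus an unbiased estimate of the variance of t' is given by 

$$\hat V(t') = t'^2 - G. \tag{37}$$

This estimate may assume negative values. 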


5. A THIRD SET OF ESTIMATES 


We shall now give another set of estimates for Y. From the set of selection 
probabilities given in Section 2, we can calculate the unconditional 
probabilities with which the units will be drawn at each draw. Suppose 

$$P_{ij} \quad (i = 1, 2, \cdots, N;\ j = 1, 2, \cdots, n) \tag{38}$$

is the unconditional probability that the ith unit (in the population) will be 
drawn at the jth draw. Then the proposed set of estimates is 

$$t_1'' = y_1/P_{i1}, \quad \cdots, \quad t_n'' = y_n/P_{in}. \tag{39}$$

It is easy to see that $E(t_1'') = E(t_2'') = \cdots = E(t_n'') = Y$, so that 

$$t'' = \sum_{i=1}^{n} c_i t_i'', \quad \sum_{i=1}^{n} c_i = 1 \tag{40}$$

is an unbiased estimate. Also 

$$V(t_n'') = \sum \frac{y_n^2}{P_{in}} - Y^2 \tag{41}$$

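where $\sum$ denotes summation over all units in the population. And 

$$C(t_\lambda'', t_\mu'') = \sum p_{i1} \cdots \sum p_{i\lambda} \frac{y_\lambda}{P_{i\lambda}} \cdots \sum p_{i\mu} \frac{y_\mu}{P_{i\mu}} - Y^2. \tag{42}$$

Hence the expression for the variance of t'' is apparent. In particular for 
n=2, we have 

$$t'' = c_1 \frac{y_1}{P_{i1}} + c_2 \frac{y_2}{P_{i2}}, \tag{43}$$

$$V(t'') = c_1^2 \sum_{i=1}^{N} \frac{y_i^2}{P_{i1}} + c_2^2 \sum_{i=1}^{N} \frac{y_i^2}{P_{i2}} + 2 c_1 c_2 \sum_{i=1}^{N} y_i \sum_{j \neq i} \frac{y_j}{P_{j2}}\, p_{j2}^{i} - Y^2. \tag{44}$$
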


To obtain an estimate which reduces to N times the sample mean when the 
probabilities of selection are equal, it is easy to see that in this case 

$$c_1 = c_2 = \cdots = c_n = \frac{1}{n}. \tag{45}$$

To estimate the variance of t'' we make use of the fact that 

$$E(t_\lambda t_\mu) = Y^2 \quad (\lambda \neq \mu)$$

so that 

$$\hat V(t'') = t''^2 - \frac{2 \sum' t_\lambda t_\mu}{n(n - 1)} \tag{46}$$

where $\sum'$ denotes summation over the $\binom{n}{2}$ pairs. 
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This estimate may assume negative values. 
As stated before, explicit expressions for the unconditional probabilities can 
be worked out in the general case. For example when n=2, we have 


Pa = Pay 
(47) 
Pe = Pup + Pupir* + rest Pi-sPa* + Pin spat + +++ Pyipi® 


where $p_{i2}^{t}$ is the probability that the ith unit is selected at the second draw when
it is known that the tth unit has been selected at the first draw. In case the
second unit is drawn with probabilities proportionate to size (pps) of the re-
maining units, we have

$$P_{i1} = p_i \ \text{(say)}, \qquad P_{i2} = p_i\left(\sum_{t=1}^{N}\frac{p_t}{1-p_t} - \frac{p_i}{1-p_i}\right). \tag{48}$$

If, however, the second draw is made with equal probabilities, we have

$$P_{i1} = p_i, \qquad P_{i2} = \frac{1-p_i}{N-1}.$$
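The two special cases for n=2 can be checked numerically. The following is a minimal sketch, not part of the original paper: the function names and the first-draw probabilities are hypothetical, and exact fractions are used so that the comparison with the closed forms is not blurred by rounding.

```python
from fractions import Fraction

def second_draw_probs_pps(p):
    """P_i2 by direct summation: unit t first, then unit i with pps of the rest."""
    n_units = len(p)
    return [
        sum(p[t] * p[i] / (1 - p[t]) for t in range(n_units) if t != i)
        for i in range(n_units)
    ]

def second_draw_probs_closed_form(p):
    """Closed form p_i * (sum_t p_t/(1-p_t) - p_i/(1-p_i))."""
    total = sum(pt / (1 - pt) for pt in p)
    return [pi * (total - pi / (1 - pi)) for pi in p]

def second_draw_probs_equal(p):
    """P_i2 when the second draw is made with equal probabilities."""
    n_units = len(p)
    return [(1 - pi) / (n_units - 1) for pi in p]

# Hypothetical first-draw probabilities for a population of four units.
p = [Fraction(1, 10), Fraction(2, 10), Fraction(3, 10), Fraction(4, 10)]
P2_pps = second_draw_probs_pps(p)
P2_equal = second_draw_probs_equal(p)
```

In both cases the second-draw probabilities add to one over the population, as they must for unconditional probabilities of a single draw.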


6. SOME ASPECTS OF YATES AND GRUNDY’S ESTIMATOR 


Before studying the relative performance of the estimators proposed, we shall 
turn our attention to the remark made by Yates and Grundy [6] that “although 
it is not immediately apparent that their estimator is necessarily positive, this 
appears to be the case when the usual method of selection is employed.” We 
shall prove that this estimator is positive in at least two important situations. 

(i) When the first unit is selected with probabilities proportionate to some
measure of size (pps) and the remaining units are selected with equal proba- 
bility. 

(ii) When the first unit is selected with probabilities proportionate to some
measure of size and the second unit with probabilities proportionate to the sizes 
of the remaining units, the sample size being two. 

We will present two theorems incorporating the results mentioned, which 
have also been noticed by Sen. 

Theorem 3. In sampling with varying probabilities when the first unit is
selected with pps and the remaining units with equal probability, Yates and
Grundy’s estimator of the error variance is always positive.

Proof: If the first unit is selected with probabilities

$$p_1, p_2, \ldots, p_N; \qquad \sum p_i = 1 \tag{50}$$

and the remaining n−1 units with equal probability without replacement, we
have

$$\pi_i = \frac{N-n}{N-1}\,p_i + \frac{n-1}{N-1},$$
$$\pi_{ij} = \frac{n-1}{N-1}\left[\frac{N-n}{N-2}\,(p_i + p_j) + \frac{n-2}{N-2}\right].$$

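Now Yates and Grundy’s estimator of variance is given by

$$V_{YG} = \sum_{j>i} \frac{\pi_i\pi_j - \pi_{ij}}{\pi_{ij}}\left(\frac{y_i}{\pi_i} - \frac{y_j}{\pi_j}\right)^2. \tag{53}$$

Substituting in (53) for $\pi_i$ and $\pi_{ij}$ from (51) and (52), we have

$$V_{YG} = \frac{N-n}{(N-1)^2}\sum_{j>i}\frac{1}{\pi_{ij}}\left[(N-n)p_ip_j + \frac{n-1}{N-2}(1 - p_i - p_j)\right]\left(\frac{y_i}{\pi_i} - \frac{y_j}{\pi_j}\right)^2,$$

which is always positive.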

Theorem 4. In sampling with varying probabilities for samples of size 2
when the first unit is selected with pps and the second unit with pps of the re-
maining units, Yates and Grundy’s estimator of the variance is always positive.

Proof: For the sake of definiteness, we suppose that the units selected are
$y_1$ and $y_2$.

Then we have

$$\pi_{12} = p_1p_2\left[\frac{1}{1-p_1} + \frac{1}{1-p_2}\right],$$
$$\pi_1 = p_1\left[1 + \frac{p_2}{1-p_2} + A\right], \qquad \pi_2 = p_2\left[1 + \frac{p_1}{1-p_1} + A\right],$$

where

$$A = \sum_{t=3}^{N}\frac{p_t}{1-p_t}.$$
Yates and Grundy’s estimator of the error variance then is

$$\frac{A^2 + A\left[\dfrac{1}{1-p_1} + \dfrac{1}{1-p_2}\right] - \dfrac{1-p_1-p_2}{(1-p_1)(1-p_2)}}{\dfrac{1}{1-p_1} + \dfrac{1}{1-p_2}}\left(\frac{y_1}{\pi_1} - \frac{y_2}{\pi_2}\right)^2. \tag{57}$$


For the two units selected it is easy to see that A is minimum when

$$p_3 = p_4 = \cdots = p_N = \frac{k}{N-2},$$

where

$$k = 1 - p_1 - p_2 \tag{59}$$

so that the minimum value $A_{\min}$ of A is given by

$$(N-2)k/(N-2-k). \tag{60}$$

Hence the numerator of (57) exceeds

$$A_{\min}^2 + \frac{(N-1)k^2}{(1-p_1)(1-p_2)(N-2-k)},$$

which is positive for N > 2.
Thus Yates and Grundy’s estimator of variance is positive in the two situa- 
tions described before. 
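Both theorems can be illustrated for samples of two by enumerating the ordered samples, building $\pi_i$ and $\pi_{ij}$ directly, and inspecting the factor $(\pi_i\pi_j - \pi_{ij})/\pi_{ij}$ that multiplies the squared difference in Yates and Grundy’s estimator. This is only a sketch under assumptions: the helper names and the selection probabilities are hypothetical, and a single numerical case is of course no substitute for the proofs.

```python
from fractions import Fraction
from itertools import permutations

def inclusion_probs(p, second_draw):
    """Enumerate ordered samples of size 2 and return (pi_i, pi_ij).

    second_draw(p, first, second) is the conditional probability of drawing
    unit `second` at the second draw given that `first` was drawn first.
    """
    n_units = len(p)
    pi = [Fraction(0)] * n_units
    pi_pair = {}
    for i, j in permutations(range(n_units), 2):
        prob = p[i] * second_draw(p, i, j)
        pi[i] += prob
        pi[j] += prob
        key = (min(i, j), max(i, j))
        pi_pair[key] = pi_pair.get(key, Fraction(0)) + prob
    return pi, pi_pair

def equal_second(p, first, second):
    """Situation (i): remaining units drawn with equal probability."""
    return Fraction(1, len(p) - 1)

def pps_second(p, first, second):
    """Situation (ii): second unit drawn with pps of the remaining units."""
    return p[second] / (1 - p[first])

def yates_grundy_weights(pi, pi_pair):
    """Factors (pi_i*pi_j - pi_ij)/pi_ij of the Yates-Grundy estimator."""
    return {(i, j): (pi[i] * pi[j] - pij) / pij for (i, j), pij in pi_pair.items()}

# Hypothetical selection probabilities for N = 4 units.
p = [Fraction(1, 10), Fraction(2, 10), Fraction(3, 10), Fraction(4, 10)]
pi_i, pair_i = inclusion_probs(p, equal_second)
pi_ii, pair_ii = inclusion_probs(p, pps_second)
weights_i = yates_grundy_weights(pi_i, pair_i)
weights_ii = yates_grundy_weights(pi_ii, pair_ii)
```

Since every factor is positive and is multiplied by a square, the estimate cannot be negative for any sample drawn under either scheme.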


7. RELATIVE PERFORMANCE OF THE ESTIMATORS 


We shall now study the relative performance of the estimators proposed in
this paper and those given by Horvitz and Thompson [4] and Yates and
Grundy [6]. For this purpose we shall consider the following population given
by Yates and Grundy:

      p       y
     0.1     0.5
     0.2     1.2
     0.3     2.1
     0.4     3.2


This population was deliberately chosen by them as being more extreme than 
will normally be encountered in practice. The object is to estimate the total of 
the population by taking a sample of two units. We shall study the performance 
of the estimators 


~ 


buen; by, t’, of Yar 


given by (24), (15) and (21), (27) and (32), (40) and (45), and (1) respectively.
The discussion is restricted to the two important cases (i) and (ii) given in
Section 6. The results obtained are presented in Tables 1 and 2. As proved
before, Yates and Grundy’s estimate of variance $V_{YG}(\hat{y}_{HT})$ is positive in both
cases, as is $V(t_{\text{mean}})$, which is always positive. Horvitz and Thompson’s
estimator $V_{HT}(\hat{y}_{HT})$ becomes negative twice under Case (i) and eight times
under Case (ii). Certain other estimators (of variance) given in this paper also
take negative values. With regard to the true variances themselves of the
estimates, it is found that $t_{\text{mean}}$ is best in Case (i) while ty is best in Case (ii). In
both cases, however, $t_{\text{mean}}$ has a much smaller variance as compared with the
variances of other estimators. Judged by this and some other examples
considered by the author, it appears that the estimator $t_{\text{mean}}$ compares very
favorably with Yates and Grundy’s or Horvitz and Thompson’s estimator $\hat{y}_{HT}$. It
has besides the distinct advantage that for any sample size and for any system
of probability selection, the unbiased estimate of the variance would not be
negative, while no general statement of this kind can be made for other
estimators. While the intention is not to show in this paper that the estimator
$t_{\text{mean}}$ is always better than other possible estimators, the main point is that,
given some reasonably correlated measures of size, by sampling with
probability proportionate to these sizes, the estimator $t_{\text{mean}}$ will be superior from the
point of view of possibly smaller variance, computational and algebraic
simplicity, etc.


TABLE 1. UNBIASED ESTIMATES OF ERROR VARIANCE FOR CASE (i)


Units ve”) ve’) Vtg) Vitmeon) Wro@ur) Vardar) 


2.06 45 .80 —1.14 .20 1.51 
15.00 114.20 4.44 .81 4. 
59.75 241. j 6.50 7. 
—1.50 4. —1.76 1. 
11.25 15. 
56.20 34. 
—6.42 .f 
—6.84 —4, 
50 .3* —13. 

—14 «4 —- 3. 
—15.34 —13. 
4,3 — 3.75 —24. 
True error 
variance 6.222 9. 3.619 2.223 2.884 
TABLE 2. UNBIASED ESTIMATES OF ERROR VARIANCE FOR CASE (ii)


Units ve” Pe’) V (ty) Vitmeen) Vroguar) Varur) 


—4.63 93 .20 1.86 ; —6.21 
-87 114.20 —4.69 
18.74 134.60 
— 7.59 10.85 
-23 12.02 
19.40 10.39 
10.49 — 3.17 
— 8.26 — 5.52 
19.47 —12.80 
—15.28 — 9.86 
—12.32 —13.15 
— 4.36 —17.01 
PART II
8. EXTENSION TO MULTISTAGE DESIGNS 


Introduction. In the first part of this paper we have obtained some estimators 
in unistage designs when the units are selected with varying probabilities with- 
out replacement. Since large scale surveys are generally based on multistage 
designs, it would be appropriate to extend the estimators of the first part to 
such designs. The use of sampling with varying probabilities in subsampling 
designs was first suggested by Hansen and Hurwitz [2]. They found out that a 
subsampling design, in which only one first stage unit was selected from each
stratum with probability proportionate to the measure of the size of the unit
and a fixed number of second stage units was selected with equal probability
from each of the selected first stage units, brought about marked improvement
in the precision of the estimate as compared with sampling systems involving
the use of equal probabilities. In another paper Hansen and Hurwitz [3]
considered the problem of determining probabilities of selection which have
optimum properties in the sense that the schemes provide maximum efficiency
at a given cost. They however restricted themselves to the case in which the
first stage units are selected with pps with replacement. To avoid the repetition
of first stage units in order to achieve gains in efficiency, Midzuno [5] developed
the scheme for the selection of n first stage units with pps from a stratum but
without replacement for obtaining unbiased estimates of the population total.
Midzuno, however, did not give unbiased estimates of the variance of the
estimates. Horvitz and Thompson [4], though primarily interested in the
unistage case, considered the multistage case where the first stage units are selected
with varying probabilities and without replacement and random samples of
predetermined sizes are chosen from the selected first stage units. If the
population (or stratum) consists of N first stage units of which n units are sampled
and if for the ith unit in the population there is an estimator $T_i$ based on
sampling at second and subsequent stages such that

$$E(T_i) = Y_i = \text{total of the ith unit}, \qquad V(T_i) = \sigma_i^2 = E(s_i^2),$$


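Horvitz and Thompson’s estimator is

$$\hat{y}_{HT} = \sum_{i=1}^{n}\frac{T_i}{\pi_i} \tag{62}$$

whose variance is

$$V(\hat{y}_{HT}) = \sum_{i=1}^{N}\frac{1-\pi_i}{\pi_i}Y_i^2 + 2\sum_{j>i=1}^{N}\frac{\pi_{ij} - \pi_i\pi_j}{\pi_i\pi_j}Y_iY_j + \sum_{i=1}^{N}\frac{\sigma_i^2}{\pi_i}, \tag{63}$$

an unbiased estimate of which is

$$V_{HT}(\hat{y}_{HT}) = \sum_{i=1}^{n}\frac{1-\pi_i}{\pi_i^2}T_i^2 + 2\sum_{j>i=1}^{n}\frac{\pi_{ij} - \pi_i\pi_j}{\pi_i\pi_j\pi_{ij}}T_iT_j + \sum_{i=1}^{n}\frac{s_i^2}{\pi_i}. \tag{64}$$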


The estimator (64), like the corresponding one in the unistage case, can take
negative values. Durbin [1] gave an alternative estimate of the variance
following Yates and Grundy [6], who were interested in the unistage case only.
Recasting the expression (63) as

$$\sum_{j>i=1}^{N}(\pi_i\pi_j - \pi_{ij})\left(\frac{Y_i}{\pi_i} - \frac{Y_j}{\pi_j}\right)^2 + \sum_{i=1}^{N}\frac{\sigma_i^2}{\pi_i} \tag{65}$$

the unbiased estimate of the variance obtained is

$$V_{YG}(\hat{y}_{HT}) = \sum_{j>i=1}^{n}\frac{\pi_i\pi_j - \pi_{ij}}{\pi_{ij}}\left(\frac{T_i}{\pi_i} - \frac{T_j}{\pi_j}\right)^2 + \sum_{i=1}^{n}\frac{s_i^2}{\pi_i} \tag{66}$$

which is believed to be less often negative.
Summary of results obtained. Let the population (or stratum) consist of N first
stage units

$$U_1, U_2, \ldots, U_N \tag{67}$$

the total of the ith unit being $Y_i$. From these first stage units n units are selected
without replacement with varying probabilities according to the scheme given
in Section 2. We subsample a predetermined number of second stage units from
each of the first stage units selected and may proceed with further subsampling
of the second stage units in a known and predetermined way and so on. Suppose
that for the ith first stage unit there is an estimator $T_i$ (based on subsampling
at second and subsequent stages) such that

$$\mathcal{E}_2(T_i) = Y_i, \qquad V_2(T_i) = \sigma_i^2 = \mathcal{E}_2(s_i^2) \tag{68}$$

where $\mathcal{E}_2$ and $V_2$ denote conditional expectations and variances over the second
and subsequent stages.

Defining similarly $\mathcal{E}_1(f)$ and $V_1(f)$ as the expected value and variance of f over
different samples of units at the first stage, we can easily prove:

Theorem 5. The expected value of f is equal to the expected value of the
conditional expected values, i.e.,

$$\mathcal{E}(f) = \mathcal{E}_{1,2}(f) = \mathcal{E}_1\mathcal{E}_2(f). \tag{69}$$

It may be noted that, in terms of the operators $E_1, E_2, \ldots, E_n$ of Part I,
we have

$$\mathcal{E}_1 = E_1E_2\cdots E_n. \tag{70}$$


Theorem 6. The variance of f equals the sum of the expected value of the
conditional variance and the variance of the conditional expected value, i.e.,

$$V(f) = V_{1,2}(f) = (\mathcal{E}_1V_2 + V_1\mathcal{E}_2)(f). \tag{71}$$

We note that in terms of our operators of Part I,

$$V_1 = E_1V_2\cdots + V_1E_2\cdots. \tag{72}$$
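Theorems 5 and 6 can be verified exactly on a small artificial two-stage scheme. The sketch below is only illustrative: the first-stage outcomes, their probabilities and the conditional distributions of f are hypothetical numbers chosen for the check, not taken from the paper.

```python
from fractions import Fraction

# Hypothetical two-stage scheme: a first-stage outcome u occurs with
# probability prob1[u]; given u, the statistic f takes each listed value
# with the attached conditional probability.
prob1 = {"a": Fraction(1, 4), "b": Fraction(3, 4)}
outcomes2 = {
    "a": [(Fraction(2), Fraction(1, 2)), (Fraction(6), Fraction(1, 2))],
    "b": [(Fraction(1), Fraction(1, 3)), (Fraction(4), Fraction(2, 3))],
}

def cond_mean(u):
    """Conditional expectation of f over the second stage."""
    return sum(value * prob for value, prob in outcomes2[u])

def cond_var(u):
    """Conditional variance of f over the second stage."""
    m = cond_mean(u)
    return sum(prob * (value - m) ** 2 for value, prob in outcomes2[u])

# Theorem 5: overall mean as the first-stage mean of conditional means.
mean_f = sum(prob1[u] * cond_mean(u) for u in prob1)

# Variance computed directly over the joint distribution.
var_direct = sum(
    prob1[u] * prob * (value - mean_f) ** 2
    for u in prob1
    for value, prob in outcomes2[u]
)

# Theorem 6: expected conditional variance plus variance of conditional mean.
expected_cond_var = sum(prob1[u] * cond_var(u) for u in prob1)
var_of_cond_mean = sum(prob1[u] * (cond_mean(u) - mean_f) ** 2 for u in prob1)
```

The direct variance and the sum of the two components agree exactly, which is the decomposition used repeatedly in the variance formulas that follow.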


In order to estimate the total of the population (or stratum), one set of
unbiased estimates is

$$z_1 = T_1/p_{i1},$$
$$z_2 = T_1 + \frac{T_2}{p_{i2}},$$
$$\vdots$$
$$z_n = T_1 + T_2 + \cdots + T_{n-1} + \frac{T_n}{p_{in}},$$

where $T_1, T_2, \ldots, T_n$ are unbiased estimates of the totals of the respective first
stage units drawn in the sample in this order.

$$z = \sum_{i=1}^{n} c_iz_i, \qquad \sum_{i=1}^{n} c_i = 1 \tag{74}$$

is unbiased. We have

$$V(z_n) = V_1(t_n) + \mathcal{E}_1\left[\sigma_1^2 + \sigma_2^2 + \cdots + \sigma_{n-1}^2 + \frac{\sigma_n^2}{p_{in}^2}\right] \tag{75}$$

where $t_n$ is the corresponding unistage estimator discussed in Part I.
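With regard to the covariance of $z_r$ and $z_s$, theorem 5 gives

$$C(z_r, z_s) = \sum_{i=1}^{N}\sigma_i^2$$

so that unlike the unistage case, $z_r$ and $z_s$ are not uncorrelated.
Considering the series of estimates

$$v_1 = s_1^2/p_{i1},$$
$$v_2 = s_1^2 + \frac{s_2^2}{p_{i2}},$$
$$\vdots$$
$$v_n = s_1^2 + s_2^2 + \cdots + s_{n-1}^2 + \frac{s_n^2}{p_{in}},$$

it is easy to see that

$$v = \sum_{i=1}^{n} c_iv_i, \qquad \sum_{i=1}^{n} c_i = 1$$

is an unbiased estimate of $\sum_{i=1}^{N}\sigma_i^2$.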
Hence an unbiased estimate of the variance of z is

$$\hat{V}(z) = z^2 - \frac{2}{n(n-1)}\sum_{j>i=1}^{n} z_iz_j + v. \tag{79}$$

In case $c_1 = c_2 = \cdots = c_n = 1/n$, it is easy to see that $\hat{V}(z)$ is positive.

Comparing (79) with the corresponding estimator (23) in the unistage case, 
we arrive at the following important rule for estimating the variance in multi- 
stage designs. 

The estimate of variance in multistage sampling is the sum of two parts. 
The first part is equal to the estimate of variance calculated on the assumption 
that the first stage units have been measured without error. The second part 
is obtainable from the population total estimate itself by substituting the 
estimated variances for the estimates of the totals of the units. 
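The rule can be checked by exact enumeration in a small case. The sketch below makes several assumptions that are not in the paper: samples of n=2 with $c_1 = c_2 = 1/2$, the second first-stage unit drawn with pps of the remaining units, a second stage in which each $T_i$ equals $Y_i \pm d_i$ with probability one half (so that $\sigma_i^2 = d_i^2$), and the known $\sigma_i^2$ standing in for its unbiased estimate $s_i^2$. All names and numbers are hypothetical.

```python
from fractions import Fraction
from itertools import permutations, product

def ordered_estimates(first, second, p_first, p_second_cond):
    """The pair (x_1/p_i1, x_1 + x_2/p_i2) for quantities x attached to the draws."""
    return first / p_first, first + second / p_second_cond

def variance_estimate(T_first, T_second, s2_first, s2_second, p_first, p_second_cond):
    """Return (z, estimated variance of z) for n = 2 and c_1 = c_2 = 1/2."""
    z1, z2 = ordered_estimates(T_first, T_second, p_first, p_second_cond)
    z = (z1 + z2) / 2
    # Second part: the same formulas with estimated variances in place of totals.
    v1, v2 = ordered_estimates(s2_first, s2_second, p_first, p_second_cond)
    v = (v1 + v2) / 2
    # First part: the unistage form z^2 - 2/(n(n-1)) * z1*z2 with n = 2.
    return z, z * z - z1 * z2 + v

# Hypothetical population of three first-stage units.
Y = [Fraction(3), Fraction(5), Fraction(10)]          # unit totals
d = [Fraction(1), Fraction(2), Fraction(1)]           # T_i = Y_i +/- d_i
p = [Fraction(1, 5), Fraction(3, 10), Fraction(1, 2)]  # first-draw probabilities

mean_z = Fraction(0)
mean_z_sq = Fraction(0)
mean_var_est = Fraction(0)
for i, j in permutations(range(len(Y)), 2):
    p_cond = p[j] / (1 - p[i])  # pps of the remaining units
    for sign_i, sign_j in product((-1, 1), repeat=2):
        prob = p[i] * p_cond * Fraction(1, 4)
        z, v_hat = variance_estimate(
            Y[i] + sign_i * d[i], Y[j] + sign_j * d[j],
            d[i] ** 2, d[j] ** 2, p[i], p_cond,
        )
        mean_z += prob * z
        mean_z_sq += prob * z * z
        mean_var_est += prob * v_hat
true_variance = mean_z_sq - mean_z ** 2
```

Under these assumptions the expectation of z is the population total and the expectation of the variance estimate equals the true variance of z, which is what the two-part rule asserts.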

Another set of unbiased estimates is 
is an unbiased estimate. And
whereas
1 1 


C(z’z,') = (N-— d)(N -A\+1)?:--(N— mie Pa 
(83)
which is the same as obtained in the unistage case. In order to estimate the
variance of z’ we notice that 


N N 
&(T z,’) — 2; Y,' + > o%° (a ss, 1, 2, te phi. (84) 
k=1 k=! 


y? — » Y;? 


&(T z,/') = oY 


(j>t=1,2,---: ‘ (85) 


Considering the estimates 


v = 81°/pa 
1 1 82? 
N — 1 Pa Da 
v=


v,' = ait ad 
(N —1)(N —2)---(N-—n+1) Pa Pp 
it is found that 





$$v' = \sum_{i=1}^{n} c_iv_i', \qquad \sum_{i=1}^{n} c_i = 1$$

is an unbiased estimate of $\sum_{i=1}^{N}\sigma_i^2$.
Hence an unbiased estimate of V(z’) is given by
V(e') = 2"? - - Lh Te’ - abot > Tz; +0’. (88) 
n (3) j>i=l 
Another set of unbiased estimates is

$$z_1'' = T_1/P_{i1},$$
$$z_2'' = T_2/P_{i2},$$
$$\vdots$$
$$z_n'' = T_n/P_{in},$$

where $P_{ij}$ is defined by (38). As before

$$z'' = \sum_{i=1}^{n} c_iz_i'', \qquad \sum_{i=1}^{n} c_i = 1 \tag{90}$$

is unbiased for estimating Y. With regard to the variance of $z_n''$ we have

$$V(z_n'') = V_1(t_n'') + \mathcal{E}_1\left(\frac{\sigma_n^2}{P_{in}^2}\right). \tag{91}$$

The covariance between $z_r''$ and $z_s''$ is given by (42) in Part I. The expression
for the variance of z’’ is apparent. An unbiased estimate of the variance of z’’
is given by


2 py 2p 
! n(n — 1) 


V(2") = 2/2? — 


+0 (92) 


where $\sum'$ denotes summation over the different pairs in the sample.
THE EFFICIENCIES OF TESTS OF RANDOMNESS
AGAINST NORMAL REGRESSION


ALAN STUART
London School of Economics


I. INTRODUCTION


THE purpose of this note is two-fold: (1) it corrects an error in [6], to which
this note is essentially a supplement; (2) it gives a table of the efficiencies 
of tests of randomness against normal alternatives which is brought up to date 
to cover work done since [6] was published. 


II. ASYMPTOTIC RELATIVE EFFICIENCY
In [6], a result of Pitman [5] was used for the asymptotic relative efficiency
(A.R.E.) of two consistent, asymptotically normally distributed test statistics.
Noether [4] has recently generalized Pitman’s result as follows:
Denoting mean and variance by E and V and sample size by n as usual, with
subscripts 1, 2 referring to the statistics $t_1$, $t_2$, let $m_i$ be the least integer such that

$$E_i^{(m_i)}(0) = \left[\frac{\partial^{m_i}}{\partial\beta^{m_i}}E(t_i)\right]_{\beta=0} \neq 0, \qquad (i = 1, 2)$$

where $\beta$ is the parameter characterizing the null hypothesis, and define $\delta_i > 0$
by

$$\frac{\{E_i^{(m_i)}(0)\}^2}{V(t_i \mid \beta = 0)} \sim c_i^2n^{2m_i\delta_i}, \qquad (i = 1, 2).$$

Then the A.R.E. of $t_1$ compared to $t_2$ is

$$\text{A.R.E.}(t_1, t_2) = \begin{cases} 0 & \text{if } \delta_1 < \delta_2, \\ \left(\dfrac{c_1}{c_2}\right)^{1/(m\delta)} & \text{if } \delta_1 = \delta_2 = \delta \text{ and } m_1 = m_2 = m. \end{cases} \tag{1}$$


The error in [6] consisted in replacing (1) by 
0 if m > 1 and m= 1 
A.R.E. (h, 2) = {¢ 
(ty, &) — if 6, = 6: #1, and m = m, = 1. (2) 
C2 


The next section gives the supplementary details necessary to evaluate (1)
correctly for the tests described in [6]. 


III. SIX TESTS OF RANDOMNESS
The following results were given in [6]:

Regression coefficient test (b):            $m = 1$, $\delta = 3/2$,

Spearman’s rank correlation test (V) and
Kendall’s rank correlation test (Q):        $m = 1$, $\delta = 3/2$,

Difference-sign test (D):                   $m = 1$, $\delta = 1/2$.





286 AMERICAN STATISTICAL ASSOCIATION JOURNAL, JUNE 1956 


The other two tests considered in [6], the turning points test (T) and the
rank serial correlation test (W), were found to have m > 1.
It is a matter of straightforward algebra to show that, for T,

$$m = 2, \qquad \delta = 1/4, \qquad c^2 = 15/(2\pi^2).$$

The computations for W are more considerable, but quite routine, leading to

$$E_0''(W) \sim n^5/(24\pi), \qquad E_0''(W') \sim n^5/(24\pi) - sn^4/(8\pi) \sim n^5/(24\pi), \tag{3}$$

where W and W’ refer to the non-circular and circular coefficients and s is the
lag of the coefficient, as in [6]. Also, from Wald and Wolfowitz’s general
variance formula in [8], we obtain for the case when ranks are used,

$$V(W' \mid \beta = 0) = \frac{n^2(n+1)(n-3)(5n+6)}{720} \sim \frac{n^5}{144}, \tag{4}$$

and this asymptotic result holds for W and for any s. Using (3) and (4) in (1),
we obtain for W,

$$m = 2, \qquad \delta = 5/4, \qquad c^2 = 1/(4\pi^2).$$


IV. FOUR OTHER TESTS 


Records test. In [3], Foster and Stuart proposed a new test of randomness
based on the “records” established in a series of observations. An upper record
is said to be established whenever an observation exceeds all previous
observations, and a lower record whenever one is exceeded by all previous observations.
The test statistic, d, is the difference between the numbers of upper and lower
records. In [7] it is established that, for d,

$$m = 1, \qquad \delta = 1, \qquad c^2 = 1.$$

Median test. Brown and Mood [1] proposed this well-known test, which
simply counts how many of the first ½n of the n observations exceed the median
of the whole set. Cox and Stuart [2] have shown that for this test, called B,

$$m = 1, \qquad \delta = 3/2, \qquad c^2 = 1/(8\pi).$$

Two sign tests. Cox and Stuart [2] proposed two new tests for randomness.
The first, $S_1$, is computed by comparing the kth observation from the beginning
of the series of n with the kth from the end, and scoring (n−2k+1) if the earlier
observation is the greater, and zero otherwise. This is done for k = 1, 2, ..., ½n,
and the scores summed to form $S_1$, which has

$$m = 1, \qquad \delta = 3/2, \qquad c^2 = 1/(6\pi).$$

The $S_1$ test is the most efficient of the class of sign tests considered in [2]. A
simpler test, $S_3$, is obtained by comparing the kth observation with the
(⅔n+k)th for k = 1, 2, ..., ⅓n, and scoring 1 or 0 according to whether or not
the earlier observation is greater. The central ⅓n observations are ignored.
This is a simple counting operation, and the sum of the scores, $S_3$, has

$$m = 1, \qquad \delta = 3/2, \qquad c^2 = 4/(27\pi).$$

$S_3$ is the most efficient sign test using equal weights of the class considered
in [2].


V. COMPARISON OF TEST EFFICIENCIES


Using the results outlined in sections III and IV in equation (1), we obtain
the following consolidated table for the asymptotic relative efficiencies of ten
tests of randomness against normal regression alternatives:


                                               Values of             Asymptotic relative
Test                                       m     δ      c²           efficiency

Regression coefficient test (b)            1     3/2    1/12         1
Spearman’s rank correlation test (V) and
Kendall’s rank correlation test (Q)        1     3/2    1/(4π)       (3/π)^(1/3) = .98
Weighted sign test (S1)                    1     3/2    1/(6π)       (2/π)^(1/3) = .86
Unweighted sign test (S3)                  1     3/2    4/(27π)      {16/(9π)}^(1/3) = .83
Brown-Mood median test (B)                 1     3/2    1/(8π)       {3/(2π)}^(1/3) = .78
Rank serial correlation test (W)           2     5/4    1/(4π²)      0
Records test (d)                           1     1      1            0
Difference sign test (D)                   1     1/2    3/π          0
Turning points test (T)                    2     1/4    15/(2π²)     0


In virtue of (1), a test has A.R.E. of zero compared with any other test for
which δ is greater. Thus orders of magnitude in efficiency separate the last four
tests in the table. Each of them has A.R.E. of zero compared to any test
appearing above it.
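The efficiencies in the table follow mechanically from (1) once m, δ and c² are known. The short sketch below reproduces them relative to the regression coefficient test; the dictionary layout and the function name are illustrative choices, and only the two cases stated in (1) are handled.

```python
import math

# (m, delta, c^2) for each test, as quoted in sections III to V.
TESTS = {
    "b": (1, 1.5, 1 / 12),
    "V, Q": (1, 1.5, 1 / (4 * math.pi)),
    "S1": (1, 1.5, 1 / (6 * math.pi)),
    "S3": (1, 1.5, 4 / (27 * math.pi)),
    "B": (1, 1.5, 1 / (8 * math.pi)),
    "W": (2, 1.25, 1 / (4 * math.pi ** 2)),
    "d": (1, 1.0, 1.0),
    "D": (1, 0.5, 3 / math.pi),
    "T": (2, 0.25, 15 / (2 * math.pi ** 2)),
}

def are(test, reference="b"):
    """A.R.E. of `test` compared to `reference`, following equation (1)."""
    m1, delta1, c1_sq = TESTS[test]
    m2, delta2, c2_sq = TESTS[reference]
    if delta1 < delta2:
        return 0.0
    if delta1 == delta2 and m1 == m2:
        # (c1/c2)^(1/(m*delta)), written in terms of the squared constants.
        return (c1_sq / c2_sq) ** (1 / (2 * m1 * delta1))
    raise ValueError("case not covered by equation (1)")

efficiencies = {name: are(name) for name in TESTS}
```

Rounded to two places, the values for V and Q, S1, S3 and B come out as .98, .86, .83 and .78, and the four tests with smaller δ come out as zero.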
A NOTE ON MATRIX INVERSION BY THE
SQUARE ROOT METHOD


DAVID DURAND
Massachusetts Institute of Technology


THE square root method, also known as Choleski’s method in the
non-statistical literature, has been discussed recently by Duncan and Kenney
[1, pp. 16-29], Dwyer [4; 5, Sec. 6.5, 13.4, 13.7, 13.8], Fox, et al. [7, p. 159],
and others, with a bibliography in [5, p. 118]. The essence of the method is
the reduction of a symmetric matrix A to the product S’S, in which S is a
triangular matrix with zeros below the main diagonal, and S’ is its transpose.
Then, if A and hence S is non-singular, $A^{-1}$ can be obtained either by inverting
S and utilizing the relation

$$A^{-1} = S^{-1}(S^{-1})', \tag{1}$$

or through a short cut described by Dwyer [5, p. 199]. But, as a rule, $A^{-1}$ is
wanted only for use in subsequent calculations, and not for its own sake. Both
S and $S^{-1}$, however, may be of interest in their own right; moreover,
calculations ordinarily involving $A^{-1}$ can often proceed directly from $S^{-1}$ through (1).
Hence, there is a potential opportunity to condense the overall computation
program by eliminating the formal calculation and recording of $A^{-1}$. This note
will give a few examples of possible use for S and $S^{-1}$.

Either S or $S^{-1}$ may be useful in evaluating quadratic forms. For example,
in testing the statistical hypothesis that an estimated set of regression
coefficients $b_1, b_2, \ldots, b_n$ has in fact the values $b_1^*, b_2^*, \ldots, b_n^*$, one encounters
the form

$$\sum\sum h_ih_ja_{ij} = HAH' \qquad (i, j = 1, 2, \ldots, n)$$

where H is a row vector of the differences $h_i = b_i - b_i^*$, and the $a_{ij}$ are either
correlation coefficients or sums of squares and products. Or again, in evaluating
Hotelling’s generalized T or the standard error of a linear combination of
regression coefficients, one encounters forms of the type

$$\sum\sum h_ih_ja^{ij} = HA^{-1}H',$$

where the $h_i$ may be means, mean differences, or the coefficients of an arbitrary
combination $\sum h_ib_i$.
The square root method reduces quadratic forms to sums of squares, as
follows:

$$\sum\sum h_ih_ja_{ij} = HAH' = HS'SH' \tag{2}$$
$$\sum\sum h_ih_ja^{ij} = HA^{-1}H' = HS^{-1}(S^{-1})'H', \tag{3}$$

with i, j = 1, 2, ..., n, and these reductions may simplify calculations, especially for (3). The matrix
product $HS^{-1}$ is a row vector containing the n elements

$$h_1s^{11}, \quad h_1s^{12} + h_2s^{22}, \quad \ldots, \quad h_1s^{1n} + h_2s^{2n} + \cdots + h_ns^{nn}, \tag{4}$$
and $(S^{-1})'H'$ is a column vector containing exactly the same elements. So,
$HS^{-1}(S^{-1})'H'$ in (3) is the sum of the squares of the elements of (4), which
can be computed as follows: multiply each row of $S^{-1}$ by the appropriate $h_i$,
sum the columns, square the sums, and sum the squares. This requires
n(n+1)/2 multiplications for the rows plus n more to square the column sums,
a total of n(n+3)/2 multiplications. On the other hand, direct evaluation of
$HA^{-1}H'$ given $A^{-1}$ requires n(n+1) multiplications (owing to the symmetry of
$A^{-1}$), or n(n−1)/2 more than are required with $S^{-1}$. This substantially heavier
multiplication load is, of course, in addition to any labor expended in obtaining
$A^{-1}$ from $S^{-1}$. So, unless the computation program greatly favors direct
calculation of $A^{-1}$, it would appear that (3) is more efficiently evaluated from $S^{-1}$
than from $A^{-1}$.

An even simpler evaluation of $HA^{-1}H'$ is possible by means of the
simultaneous reduction

$$\begin{array}{cc} A & H' \\ S & (S^{-1})'H' \end{array} \tag{5}$$

in which the square root method, formally equivalent to premultiplying A and
H’ by $(S^{-1})' = (S')^{-1}$, yields the elements of (4) as the column vector $(S^{-1})'H'$.
This reduction short-cuts calculation and recording of $S^{-1}$, which is
advantageous when $S^{-1}$ is not wanted for its own sake and when only a few quadratic
forms of type (3) are to be evaluated. Given the problem of evaluating a large
number of forms for a large number of different sets of $h_i$, preliminary
calculation of $S^{-1}$ might still be advantageous.
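The simultaneous reduction (5) translates directly into a few lines of code. The sketch below is a modern restatement rather than the desk-calculator layout of the note: the function names are hypothetical, the 2 × 2 matrix is made up for illustration, and plain Python lists are used so that nothing beyond the standard library is needed.

```python
import math

def square_root_factor(A):
    """Upper-triangular S with A = S'S (the square root method)."""
    n = len(A)
    S = [[0.0] * n for _ in range(n)]
    for i in range(n):
        diag = A[i][i] - sum(S[k][i] ** 2 for k in range(i))
        S[i][i] = math.sqrt(diag)
        for j in range(i + 1, n):
            S[i][j] = (A[i][j] - sum(S[k][i] * S[k][j] for k in range(i))) / S[i][i]
    return S

def reduce_vector(S, h):
    """Solve S'u = h by forward substitution, so that u = (S^{-1})'H'."""
    n = len(h)
    u = [0.0] * n
    for i in range(n):
        u[i] = (h[i] - sum(S[k][i] * u[k] for k in range(i))) / S[i][i]
    return u

def quadratic_form_inverse(A, h):
    """H A^{-1} H' as the sum of squares of the reduced vector, without forming A^{-1}."""
    u = reduce_vector(square_root_factor(A), h)
    return sum(x * x for x in u)

# Hypothetical symmetric positive definite matrix and vector H.
A = [[4.0, 2.0], [2.0, 3.0]]
h = [1.0, 1.0]
value = quadratic_form_inverse(A, h)
```

For this small matrix the inverse is easily written out by hand, $A^{-1} = \frac{1}{8}\begin{pmatrix} 3 & -2 \\ -2 & 4 \end{pmatrix}$, so that $HA^{-1}H' = 3/8$, which is the value the reduction returns.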

Although regression coefficients are frequently, if not usually, calculated by
means of a back solution without recourse to matrix inversion, the requirements
of the problem may favor using an inverse, sometimes $A^{-1}$, sometimes $S^{-1}$.
One might, for example, wish to select any variable, say $x_k$, from among
$x_1, x_2, \ldots, x_n$ and regress it upon all other $x_i$, for which purpose $A^{-1}$ would be
useful, since

$$b_{ki\cdot 12\cdots n} = -\frac{a^{ik}}{a^{kk}}.$$

On the other hand, one might prefer to regress any $x_k$ on $x_1, x_2, \ldots, x_{k-1}$, in
which case $S^{-1}$ would be indicated, since

$$b_{ki\cdot 12\cdots k-1} = -\frac{a_k^{ik}}{a_k^{kk}} = -\frac{s^{ik}s^{kk}}{(s^{kk})^2} = -\frac{s^{ik}}{s^{kk}}. \tag{6}$$

In the above, $a_k^{ij}$ denotes an element in $\|a_{ij}\|^{-1}$ (i, j = 1, 2, ..., k). Subscripts
are not required for the s-elements, because $S^{-1}$ has the interesting property
that $\|s_{ij}\|^{-1}$ (i, j = 1, 2, ..., k) is identical with the first k rows and columns of
$\|s_{ij}\|^{-1}$ (i, j = 1, 2, ..., n).
The relation (6) is also the basis of an interesting interpretation of S and $S^{-1}$.
From the regression coefficients of (6) we can define a set of mutually
uncorrelated variables

$$y_1 = x_1,$$
$$y_2 = x_2 - b_{21}x_1,$$
$$y_3 = x_3 - b_{31\cdot 2}x_1 - b_{32\cdot 1}x_2,$$
$$\vdots$$
$$y_n = x_n - \sum_{i=1}^{n-1} b_{ni\cdot 12\cdots n-1}x_i,$$

in which each $y_k$ represents the residual of $x_k$ about its regression on all
preceding $x_i$. Next, by letting $z_k = y_ks^{kk}$ and by using (6), we get a new set of
uncorrelated variables $Z = XS^{-1}$, of which

$$z_k = \sum_{i=1}^{k} s^{ik}x_i.$$

Although $z_k$ (like $y_k$) is uncorrelated with $x_1, x_2, \ldots, x_{k-1}$, $z_k$ is ordinarily
correlated with the remaining $x_j$; and these correlations are of considerable interest.
In particular, the correlation coefficients $r_{y_kx_k} = r_{z_kx_k}$ are the coefficients of
alienation 1, $\sqrt{1 - r_{21}^2}$, $\sqrt{1 - r_{3\cdot 12}^2}$, etc.; and $r_{y_kx_j} = r_{z_kx_j}$ (j = k+1, k+2, ..., n)
have been called semi-partial correlation coefficients (see, for example,
Summerfield and Lubin [8, p. 275]).

Whenever A is a correlation matrix (traditionally designated R), the
coefficients of alienation are the diagonal elements of S, and the semi-partial
correlation coefficients are the non-diagonal elements. Specifically,

$$s_{kj} = r_{y_kx_j} = r_{z_kx_j} \qquad (j = k, k+1, \ldots, n).$$

It should be noted in passing that the semi-partial correlation coefficients differ
from the conventional partial correlation coefficients, and also from the part
correlation coefficients of Ezekiel [6, p. 497].

The accompanying numerical example illustrates some of the characteristics
of the square root method, including calculation of the standard error of a linear
combination of regression coefficients. In this example, which derives from an
earlier discussion [2] of joint confidence regions for regression coefficients, the
dependent variable $X_4$ (log bank stock price) is regressed on $X_1$ (log capital),
$X_2$ (log number of shares), $X_3$ (log dividends). The regression coefficients of
$X_4$ on $X_1$, $X_2$, and $X_3$ are obtainable from the fourth column of $S^{-1}$, according
to (6), or by a conventional back solution with S. A particular linear
combination of these coefficients, for which one might want a standard error, is the sum,
.6512 − .9482 + .2634 = −.0336. For evaluation of the quadratic form
$\sum\sum h_ih_ja^{ij}$ (i, j = 1, 2, 3) the example illustrates reduction of the column
vector H’ = (1, 1, 1) according to (5). Then, since $s_{44} = .1084$ is the square root
of the sum of residual squares of $X_4$ around its regression on $X_1$, $X_2$, and $X_3$,
one readily obtains the desired standard error from

$$.1084\sqrt{(.5542^2 + .0425^2 + .1887^2)/N},$$

where N is the number of observations or degrees of freedom, as required.
Since A, in this example, is not a correlation matrix, S contains no
coefficients of alienation or semi-partial correlation. One may obtain these coefficients,
however, by merely dividing the square roots of the diagonal terms of A into
the corresponding columns of S.

In the earlier discussion of this example, S was obtained as a by-product of
the Doolittle method rather than directly from the square root method. While
this roundabout procedure appears inefficient in general, it deserves a word of
explanation, since special circumstances may sometimes favor it, as for
example when the Doolittle solution has been previously calculated. The Doolittle
method factors A into two triangular matrices

$$A = GB,$$

of which G has zeros above its diagonal and $B = D^{-1}G'$, with D representing the
diagonal elements of G. Then (see Dwyer [3, p. 88])

$$A = GD^{-1}G' = (GD^{-1/2})(D^{-1/2}G') = S'S.$$

This means that each row of S is obtainable by dividing the corresponding
column of G by the square root of its diagonal element.

In closing, I wish to offer two acknowledgments. First, most of the work on 
this note was performed under a grant from the Sloan Research Fund of the 
School of Industrial Management at M.I.T. Second, Professor Paul S. Dwyer,
who kindly consented to read the first draft, contributed a number of useful 
suggestions. 

For an early discussion of semi-partial correlation, see Jack W. Dunlap and 
Edward E. Cureton, “On the Analysis of Causation,” Journal of Educational 
Psychology, 21 (1930), 657-80. 


ILLUSTRATIVE EXAMPLE OF THE SQUARE ROOT METHOD 


Note: In this layout, the A-matrix consists of sums of squares and products 
of the variables X,, X2, Xs, and X, about their respective means. For further 
description see [2, pp. 133-4]. In the reduction of the A-matrix, calculations 
were carried to eight decimal places and then rounded to four for presentation. 


A x’ 
3.2554 3.5128 . 4260 — .3084 
3.5128 7.2594 .5538 —3 .6596 
3.4260 3.5538 .6985 ~ 1644 
— .3084 —3 .6596 - 1644 3.2376 


(S—)/H’ 
1.8043 ‘ -8988 -5542 


.0768 — .0425 
-2951 — .1887 


ba 


— .9482 
ON APPROXIMATING THE POINT BINOMIAL


MORTON S. RAFF
U. S. Bureau of Labor Statistics


This paper presents the results of a systematic numerical investiga-
tion of the comparative accuracy of several approximations to the
cumulative binomial distribution. The approximations studied include
the normal, arcsine, Poisson, and six others. Two of the little-known
approximations were found to be extremely good: the Poisson Gram-
Charlier when the probability p is fairly small, and the Camp-Paulson
approximation almost everywhere.


I. INTRODUCTION 


THE point binomial distribution has a simple formula, but this does not mean
that numerical values are easily computed. Individual terms may involve
factorials of large integers, and cumulative probabilities may require the
summation of a large number of individual terms. In many applications the labor
required to compute exact probabilities is prohibitive.

To cope with this situation, tables have been published¹ and a variety of
approximating functions have been studied. The latter are useful when
specialized tables are not available or do not cover the necessary range; they are also
of value in theoretical work. In order to use them intelligently it is necessary to
know something about the conditions under which one approximation may be
better than another. The present paper is an exploration of the properties of
some of the most useful approximating functions and a comparison of their
respective merits. A fuller account of the work may be found in [13], which
includes a 21-page table of computed values of nine approximating functions.

In this paper we shall test approximations to the cumulative binomial
probability, defined as follows. Let k have integral values from −1 to n inclusive,
and let

$$B(k, n, p) = \sum_{i=0}^{k}\frac{n!}{i!\,(n-i)!}\,p^i(1-p)^{n-i} \ \text{ if } k \ge 0; \qquad B(-1, n, p) = 0. \tag{1}$$

Since this probability has the symmetrical property that

$$B(k, n, p) = 1 - B(n-k-1, n, q) \quad \text{where } q = 1 - p, \tag{2}$$

there is no need to calculate B(k, n, p) for values of p greater than one-half.
The comparisons are therefore confined to values of p below one-half.
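Definition (1) and the symmetry property (2) are easy to confirm by direct computation. The following sketch is an illustration only; the function name and the particular n and p are arbitrary choices, and exact fractions are used so that the identity holds without rounding error.

```python
from fractions import Fraction
from math import comb

def binom_cdf(k, n, p):
    """B(k, n, p): sum of the binomial terms for i = 0..k; zero when k = -1."""
    return sum(comb(n, i) * p ** i * (1 - p) ** (n - i) for i in range(k + 1))

# Arbitrary illustrative parameters.
n = 12
p = Fraction(7, 10)
q = 1 - p

# Check B(k, n, p) = 1 - B(n - k - 1, n, q) for every admissible k.
symmetry_holds = all(
    binom_cdf(k, n, p) == 1 - binom_cdf(n - k - 1, n, q) for k in range(-1, n + 1)
)
```

A probability with p above one-half, such as the p = 7/10 used here, can therefore always be obtained from a computation with q = 3/10.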


II. NEED FOR THIS STUDY


While no originality is claimed for any of the approximations discussed here,
this is believed to be the first comprehensive numerical investigation of their
comparative accuracy. Erroneous opinions are common, such as the belief that
the Poisson approximation is good only when p is small and n is large.² (Actually
the size of n is irrelevant.) About the only systematic numerical investigation
has been that of Freeman and Tukey [5, 6], who confined their attention to a
limited class of approximations and used a criterion of accuracy different from
the one presented here.

1 The best tables are [1, 8, 11, and 15]. Particularly good are the Harvard tables [8], which give values to five
decimal places for n = 1(1)50(2)100(10)200(20)500(50)1000 and p = .01(.01).50 plus multiples of 1/16 and 1/12;
and the Army tables [1], which give values to seven decimal places for n = 1(1)150 and p = .01(.01).50.

2 This implication appears, among other places, in Mood's textbook [9, p. 61] and in the introduction to the
Harvard tables [8, p. xviii].


III. CRITERION OF ACCURACY


There is more than one way of judging the closeness of a given approximation. 
Uspensky [16] and Mood [9], in discussing the normal approximation, con- 
sidered errors in the probability B. Freeman and Tukey [5, 6] measured their 
errors in terms of the normal deviate corresponding to the probability B; this 
reduces the differences near the center of the distribution and magnifies those 
in the tails. Still other criteria are possible, such as the relative error in the 
cumulative probability. Different criteria serve different purposes, and any 
choice is bound to be somewhat arbitrary. 

We shall judge the closeness of the $i$th approximation $B_i(k, n, p)$ by
examining the maximum error $M_i(n, p)$, defined as the largest possible error
which can arise in estimating any sum of consecutive binomial terms with the
specified parameters. To be precise, if we define the error

$$E_i(k, n, p) = B_i(k, n, p) - B(k, n, p), \tag{3}$$

then

$$M_i(n, p) = \max_{j,k} \left| E_i(j, n, p) - E_i(k, n, p) \right|, \tag{4}$$

where $j$ and $k$ can take any integral values from $-1$ to $n$ inclusive.
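
The criterion in (3) and (4) can be sketched in a few lines of Python. Because (4) takes the largest difference between any two errors, it equals the largest error minus the smallest error over k = -1, ..., n. The names `binom_cdf`, `max_error`, `shifted` and `bumped` are illustrative only and do not come from the original study; an approximation is assumed to be any function of (k, n, p) defined for k = -1, ..., n.

```python
from math import comb

def binom_cdf(k, n, p):
    """Exact cumulative binomial B(k, n, p) of equation (1); zero for k = -1."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

def max_error(approx, n, p):
    """Maximum error M(n, p) of equation (4) for a callable approx(k, n, p)."""
    errors = [approx(k, n, p) - binom_cdf(k, n, p) for k in range(-1, n + 1)]
    # The largest |E(j) - E(k)| over all pairs is the spread of the errors.
    return max(errors) - min(errors)

# Two artificial "approximations" that show what the criterion measures.
# A constant offset cancels in any sum of consecutive terms ...
shifted = lambda k, n, p: binom_cdf(k, n, p) + 0.01
# ... whereas an error of 0.02 in the single term i = 3 is counted in full.
bumped = lambda k, n, p: binom_cdf(k, n, p) + (0.02 if k >= 3 else 0.0)

print(max_error(shifted, 10, 0.3), max_error(bumped, 10, 0.3))
```

The first figure printed is zero up to rounding and the second is 0.02, which is the sense in which $M_i$ bounds the error in any sum of consecutive binomial terms rather than in the cumulative probability alone.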


IV. KINDS OF APPROXIMATIONS TESTED 


Of the nine approximations examined in the original study [13], only the 
best six are presented here. These fall into four groups: (a) the normal and 
normal Gram-Charlier approximations, (b) the arcsine approximation, (c) the 
Poisson and Poisson Gram-Charlier approximations, and (d) the Camp-Paulson 
approximation. 

A. Normal-type approximations. The normal approximation is the most 
widely used of all approximations to the binomial distribution because of its 
simplicity and the easy availability of the necessary tables. The normal ap- 
proximation is defined simply as a normal distribution having the same mean 
and variance as the binomial: 


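$$B_1(k, n, p) = \Phi(x) = \int_{-\infty}^{x} \phi(t)\, dt, \tag{5}$$

where

$$\phi(t) = e^{-t^2/2} / \sqrt{2\pi} \quad \text{and} \quad x = (k + \tfrac{1}{2} - np) / \sqrt{npq}.$$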


Table 1 lists the maximum errors of $B_1$ as a function of n and p, and also as a
function of n and np. For constant n the maximum error $M_1(n, p)$ decreases as
p increases to 1/2 (except when p is very near zero, where the trend is reversed).
For constant p it decreases with increasing n. When the mean np is held fixed,
the maximum error increases with increasing n up to a limiting value which
represents the error in the normal approximation to the Poisson distribution.

TABLE 1
MAXIMUM ERROR OF THE NORMAL APPROXIMATION B_1

Values                        Values of n
of p          5     10     25     50    100    250    500      ∞
.002                                                 .126      0
.004                                          .125   .082      0
.008                                          .082             0
.01                                    .124          .046      0
.02                             .122   .080   .046   .032      0
.04                      .118   .077          .030             0
.05                                    .044                    0
.08                      .071                                  0
.1         .158   .106   .060   .040   .027                    0
.2         .086   .054   .032   .022   .015                    0
.3         .054   .032   .019   .013   .009                    0
.4         .024   .016   .009   .006   .004                    0
.5         .011   .005   .002   .001   .001                    0

Values
of np
0             0      0      0      0      0      0      0      0
.45                                                         .186
.5         .158                                             .185
1          .086   .106   .118   .122   .124   .125   .126   .126
1.5        .054                                             .109
2          .024   .054   .071   .077   .080   .082   .082   .083
2.5        .011          .060                               .073
3                 .032
4                 .016
5                 .005   .032   .040   .044   .046   .046   .047
7.5                      .019
10                       .009   .022   .027   .030   .032   .032
12.5                     .002
15                              .013
20                              .006   .015                 .023
25                              .001
30                                     .009                 .018
40                                     .004                 .016
50                                     .001                 .014

Fig. 1 shows the values of n and p for which this maximum error is equal to
0.05, 0.025, or 0.01. While it is difficult to formulate simple rules concerning the
accuracy of the normal approximation, the following may sometimes be helpful.
$M_1(n, p)$ is always less than $0.140/\sqrt{npq}$, but this inequality is not very
restrictive.³ It is also true that $M_1(n, p) < 0.05$ whenever $np^{3/2} > 1.07$. It has not
been found possible to develop similar expressions for other values of the
maximum error.

3 Cf. Mood's statement [9, p. 142] that $M_1 < 0.15/\sqrt{npq}$ provided $npq > 25$. The proviso is unnecessary, and the
inequality can be made a little tighter.

Fig. 1. Maximum error of the normal approximation.

The normal Gram-Charlier approximation $B_2$ consists of $B_1$ plus an
adjustment term matching the skewness of the point binomial:

$$B_2(k, n, p) = \Phi(x) - \frac{(q - p)\,\phi''(x)}{6\sqrt{npq}}, \tag{6}$$

where $x$ is the same as in (5) and $\phi''(x) = (x^2 - 1)\,\phi(x)$.

Table 2 is the counterpart of Table 1 for the approximation $B_2$. This
approximation sometimes has a negative value, but the magnitude of the negative
values can never exceed 0.001 when $np > 0.2$. The patterns for $M_2(n, p)$ are
similar to those of $M_1$, with the errors very much smaller. It can be shown that
$M_2(n, p) < 0.056/\sqrt{npq}$ for all values of n and p.
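
As an illustration of the two normal-type approximations, the sketch below evaluates (5) and (6) and measures both by the maximum-error criterion of (4). It is a minimal sketch rather than the computation used for the tables: the function names are hypothetical, $\Phi$ is obtained from the standard library's `erf`, and both approximations are assumed to be evaluated by the same formula at k = -1 and k = n.

```python
from math import comb, erf, exp, pi, sqrt

def Phi(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def phi(x):
    """Standard normal density."""
    return exp(-0.5 * x * x) / sqrt(2.0 * pi)

def binom_cdf(k, n, p):
    """Exact B(k, n, p); zero for k = -1."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

def normal_approx(k, n, p):
    """B_1 of equation (5), with the continuity correction k + 1/2."""
    return Phi((k + 0.5 - n * p) / sqrt(n * p * (1 - p)))

def normal_gram_charlier(k, n, p):
    """B_2 of equation (6): B_1 plus the skewness adjustment."""
    q = 1 - p
    x = (k + 0.5 - n * p) / sqrt(n * p * q)
    return Phi(x) - (q - p) * (x * x - 1) * phi(x) / (6 * sqrt(n * p * q))

def max_error(approx, n, p):
    """M(n, p) of equation (4): spread of the errors over k = -1, ..., n."""
    errors = [approx(k, n, p) - binom_cdf(k, n, p) for k in range(-1, n + 1)]
    return max(errors) - min(errors)

n, p = 10, 0.1
m1 = max_error(normal_approx, n, p)
m2 = max_error(normal_gram_charlier, n, p)
print(m1, m2, 0.140 / sqrt(n * p * (1 - p)), 0.056 / sqrt(n * p * (1 - p)))
```

For n = 10 and p = 0.1 the value of `m1` comes out near 0.106, in line with Table 1, and both maximum errors fall below the bounds $0.140/\sqrt{npq}$ and $0.056/\sqrt{npq}$ quoted in the text.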

TABLE 2
MAXIMUM ERROR OF THE NORMAL GRAM-CHARLIER APPROXIMATION B_2

Values                        Values of n
of p          5     10     25     50    100    250    500      ∞
.002                                                 .055      0
.004                                          .055   .024      0
.008                                          .024             0
.01                                    .054          .009      0
.02                             .054   .023   .009   .004      0
.04                      .052   .022          .004             0
.05                                    .008                    0
.08                      .020                                  0
.1         .054   .044   .013   .007   .003                    0
.2         .040   .015   .006   .003   .002                    0
.3         .020   .009   .003   .002   .001                    0
.4         .012   .006   .002   .001   .001                    0
.5         .011   .005   .002   .001   .001                    0

Values
of np
0             0      0      0      0      0      0      0      0
[?]                                                         .100
.5         .054                                             .067
1          .040   .044   .052   .054   .054   .055   .055   .055
1.5        .020                                             .031
2          .012   .015   .020   .022   .023   .024   .024   .024
2.5        .011          .013                               .016
3                 .009
4                 .006
5                 .005   .006   .007   .008   .009   .009   .009
7.5                      .003
10                       .002   .003   .003   .004   .004   .004
12.5                     .002
15                              .002
20                              .001   .002                 .002
25                              .001
30                                     .001                 .002
40                                     .001                 .001
50                                     .001                 .001

B. The arcsine approximation. Somewhat different from these normal-type
approximations is the arcsine approximation $B_3$, which is based on the
variance-stabilizing angular transformation. (An extended discussion of this
transformation may be found in Eisenhart [3].) Of the many variants of this
transformation (e.g. the chordal transformation embodied in binomial
probability paper [10]), the one tested here is:

$$B_3(k, n, p) = \Phi\left[ 2\sqrt{n} \left( \arcsin\sqrt{(k + \tfrac{1}{2})/n} - \arcsin\sqrt{p} \right) \right] \tag{7}$$

except that when $k = -1$ and $n$ these values are to be replaced by $k = -\tfrac{1}{2}$ and
$n - \tfrac{1}{2}$ respectively. Table 3 shows the maximum errors $M_3(n, p)$. Their pattern
is similar to those of $M_1$ and $M_2$, except that the increasing trend with
decreasing p goes all the way to p = 0. Like $M_1$, $M_3$ never exceeds $0.140/\sqrt{npq}$.
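
The arcsine approximation (7), including its end-point rule, can be sketched as follows. The function names are illustrative assumptions; the end-point rule is implemented by clamping k + 1/2 to the interval from 0 to n, which is what replacing k = -1 by -1/2 and k = n by n - 1/2 amounts to.

```python
from math import asin, comb, erf, sqrt

def Phi(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def binom_cdf(k, n, p):
    """Exact B(k, n, p); zero for k = -1."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

def arcsine_approx(k, n, p):
    """B_3 of equation (7), defined for k = -1, ..., n."""
    # End-point rule: k = -1 gives u = 0 and k = n gives u = n.
    u = min(max(k + 0.5, 0.0), float(n))
    return Phi(2 * sqrt(n) * (asin(sqrt(u / n)) - asin(sqrt(p))))

def max_error(approx, n, p):
    """M(n, p) of equation (4): spread of the errors over k = -1, ..., n."""
    errors = [approx(k, n, p) - binom_cdf(k, n, p) for k in range(-1, n + 1)]
    return max(errors) - min(errors)

m3 = max_error(arcsine_approx, 10, 0.5)
print(m3)
```

For n = 10 and p = 0.5 the result is close to 0.012, which agrees with the corresponding entry of Table 3.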


298 AMERICAN STATISTICAL ASSOCIATION JOURNAL, JUNE 1956

TABLE 3
MAXIMUM ERROR OF THE ARCSINE APPROXIMATION B_3

Values                 Values of n
of p          5     10     25     50    100      ∞
.1         .166   .098   .043   .028   .019
.2         .084   .043   .023   .015   .010
.3         .048   .024   .013   .009   .006
.4         .030   .015   .007   .004   .003
.5         .024   .012   .004   .002   .001

Values
of np
0                                             .579
.5         .166                               .185
1          .084   .098                 .110   .112
1.5        .048                               .080
2          .030   .043          .056          .059
2.5        .024          .042                 .050
3                 .024                        .044
4                 .015                        .037
5                 .012   .023   .028          .032
7.5                      .013                 .026
10                       .007   .015   .019   .022
12.5                     .004                 .020
15                              .009          .018
20                              .004   .010   .015
25                              .002          .014
30                                     .006   .012
40                                     .003   .011
50                                     .001   .010

C. Poisson-type approximations. The Poisson and Poisson Gram-Charlier⁴
approximations behave quite differently. They are defined as follows. Let $k$
have integral values from 0 to $n$ inclusive and let

$$P(k, m) = e^{-m} \sum_{i=0}^{k} m^i / i!, \qquad \Delta P(k, m) = e^{-m} m^k / k!.$$

Then the Poisson approximation is

$$B_4(k, n, p) = P(k, np) \tag{8}$$

and the Poisson Gram-Charlier approximation is

$$B_5(k, n, p) = P(k, np) + \tfrac{1}{2}\, p\, (k - np)\, \Delta P(k, np). \tag{9}$$

4 This approximation consists of the first two terms of what Rietz [14] calls a Gram-Charlier Type B series. It
matches the variance as well as the mean of the point binomial.


These approximations have the interesting property that their maximum errors 
are practically independent of n (see Table 4). They depend only on the
probability p, approaching zero as p decreases toward zero. The Poisson Gram-
Charlier approximation $B_5$ is quite good even for fairly substantial values of p.

The irrelevance of the size of n to the goodness of these approximations can 
also be shown under the Freeman-Tukey criterion of accuracy. 
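
The two Poisson-type approximations can be illustrated with the sketch below, which takes the coefficient of the adjustment term in (9) to be p/2, the value at which the approximation reproduces the binomial variance as well as the mean. The function names are hypothetical, both approximations are assumed to be zero at k = -1, and the Poisson term is computed through logarithms (which assumes np > 0) so that large n causes no overflow.

```python
from math import comb, exp, lgamma, log

def binom_cdf(k, n, p):
    """Exact B(k, n, p); zero for k = -1."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

def poisson_term(k, m):
    """Delta P(k, m) = e^{-m} m^k / k!, computed in logs; assumes m > 0."""
    return exp(k * log(m) - m - lgamma(k + 1))

def poisson_cdf(k, m):
    """P(k, m), the cumulative Poisson probability; zero for k = -1."""
    return sum(poisson_term(i, m) for i in range(k + 1))

def poisson_approx(k, n, p):
    """B_4 of equation (8)."""
    return poisson_cdf(k, n * p)

def poisson_gram_charlier(k, n, p):
    """B_5 of equation (9); taken as zero at k = -1 (an assumption)."""
    if k < 0:
        return 0.0
    m = n * p
    return poisson_cdf(k, m) + 0.5 * p * (k - m) * poisson_term(k, m)

def max_error(approx, n, p):
    """M(n, p) of equation (4): spread of the errors over k = -1, ..., n."""
    errors = [approx(k, n, p) - binom_cdf(k, n, p) for k in range(-1, n + 1)]
    return max(errors) - min(errors)

m4 = max_error(poisson_approx, 10, 0.1)
m5 = max_error(poisson_gram_charlier, 10, 0.1)
print(m4, m5)
```

For n = 10 and p = 0.1 the maximum errors come out near 0.029 for $B_4$ and 0.002 for $B_5$, consistent with the order of magnitude shown for p = .1 in Table 4.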
TABLE 4
MAXIMUM ERROR OF THE POISSON APPROXIMATION B_4

Values             Values of n
of p          5     10     25     50    100
.1         .025   .029   .027   .026   .026
.2         .063   .052   .055   .055   .054
.3         .090   .086   .088   .086   .087
.4         .125   .127   .124   .124   .123
.5         .177   .172   .166   .167   .167

Entries for smaller values of p, in order of increasing n (their column
positions among n = 10, 25, 50, 100, 250, 500 are not recoverable):

.002    .0006
.004    .001   .001
.008    .002
.01     .003   .002
.02     .005   .005   .005
.04     .011
.05     .012
.08     .019

MAXIMUM ERROR OF THE POISSON GRAM-CHARLIER APPROXIMATION B_5

Values                 Values of n
of p          5     10     25     50    100      ∞
.1         .002    [?]   .002   .001    [?]   .002
.2         .007    [?]   .006   .006    [?]   .006
.3         .014    [?]   .015   .016    [?]   .016
.4         .035    [?]   .031   .031    [?]   .030
.5         .051   .052   .054   .053 .05[?]   .053


D. The Camp-Paulson approximation. The Camp-Paulson approximation
$B_6$ is based on an altogether different principle from those considered thus far.
Instead of trying to match or stabilize the moments of the binomial
distribution, this approximation proceeds from the equivalence of a cumulative
binomial probability to an incomplete beta-function ratio and thence to a
probability integral of the variance ratio, F. Using an approximation to the
integral of F developed by Paulson [12] (who in turn used Wilson and Hilferty's
approximation for the distribution of chi-square [17] and the result obtained
by Fieller [4] and Geary [7] concerning the ratio of two normally distributed
variates), Camp [2] developed an explicit expression which may be written as
follows:

$$B_6(k, n, p) = \Phi\left( -y / 3\sqrt{z} \right), \tag{10}$$

where

$$y = \left[ \frac{(n-k)p}{(k+1)q} \right]^{1/3} \left[ 9 - \frac{1}{n-k} \right] + \frac{1}{k+1} - 9$$

and

$$z = \left[ \frac{(n-k)p}{(k+1)q} \right]^{2/3} \left[ \frac{1}{n-k} \right] + \frac{1}{k+1}.$$

Experience indicates that this complicated expression takes no longer to
compute than the normal Gram-Charlier approximation $B_2$, which is much less
accurate.
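
A short sketch of the Camp-Paulson computation follows. The function names are hypothetical, and two conventions are assumptions of this sketch rather than statements of the paper: the end points are set to $B_6(-1, n, p) = 0$ and $B_6(n, n, p) = 1$, where the formulas for y and z are undefined, and $\Phi$ is evaluated at $-y/3\sqrt{z}$. The sign follows from the F-distribution argument: B(k, n, p) is the probability that a variance ratio with 2(k+1) and 2(n-k) degrees of freedom exceeds (n-k)p/(k+1)q, that is, an upper-tail probability, and the check against the exact value below bears this out.

```python
from math import comb, erf, sqrt

def Phi(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def binom_cdf(k, n, p):
    """Exact B(k, n, p); zero for k = -1."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

def camp_paulson(k, n, p):
    """B_6 of equation (10) for 0 <= k < n; end points fixed by assumption."""
    if k < 0:
        return 0.0
    if k >= n:
        return 1.0
    q = 1 - p
    r = (n - k) * p / ((k + 1) * q)
    y = r ** (1 / 3) * (9 - 1 / (n - k)) + 1 / (k + 1) - 9
    z = r ** (2 / 3) / (n - k) + 1 / (k + 1)
    return Phi(-y / (3 * sqrt(z)))

def max_error(approx, n, p):
    """M(n, p) of equation (4): spread of the errors over k = -1, ..., n."""
    errors = [approx(k, n, p) - binom_cdf(k, n, p) for k in range(-1, n + 1)]
    return max(errors) - min(errors)

print(camp_paulson(4, 10, 0.5), binom_cdf(4, 10, 0.5))
print(max_error(camp_paulson, 10, 0.5), max_error(camp_paulson, 10, 0.1))
```

For n = 10, p = 0.5 and k = 4 the approximate and exact probabilities agree to about three decimal places, and the maximum errors for the two cases shown stay below $0.007/\sqrt{npq}$.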
TABLE 5
MAXIMUM ERROR OF THE CAMP-PAULSON APPROXIMATION B_6

[Table body not recoverable. Columns: values of n (10, 25, 50, 100, 250, 500).
Rows: values of np (0, .02, .5, 1, 1.5, ...). The only legible entries are
.001 and .001 in the row np = 10.]


Table 5 lists the maximum errors of the Camp-Paulson approximation. The
error $M_6(n, p)$ is strictly limited, with an absolute maximum of 0.0122 which is
never exceeded for any values of n and p. It is essentially a function of the
mean np, and tends to decline with increasing np when np > 0.02. In comparison
with $M_1$ and $M_3$ (which never exceed $0.140/\sqrt{npq}$) and $M_2$ (which never
exceeds $0.056/\sqrt{npq}$), the maximum error $M_6$ never exceeds $0.007/\sqrt{npq}$.


V. COMPARISON OF APPROXIMATIONS 


In terms of both accuracy and complexity, the six approximations fall rather 
naturally into two groups. The less accurate “simple” approximations are the 
normal $B_1$, the arcsine $B_3$, and the Poisson $B_4$. The “advanced” approximations
are the normal Gram-Charlier $B_2$, the Poisson Gram-Charlier $B_5$, and the
Camp-Paulson $B_6$. The poorest of the “advanced” approximations is almost
always more accurate than the best of the “simple” ones in any situation 
where it would seem appropriate. 

Figs. 2 and 3 show the ranges of n and p which favor each of the approxima- 
tions. Among the “simple” approximations, the Poisson is generally the best 
when p is less than about 0.075. For larger values of p the arcsine approxima- 
tion is usually the best, although the normal approximation overtakes it when 
p gets very close to one-half. 

Among the “advanced” approximations, the Poisson Gram-Charlier is best 
in about the same range as that which favors the Poisson among the “simple” 
approximations. Everywhere else the Camp-Paulson approximation is the best. 
There is no day in the sun for the normal Gram-Charlier, which is never as 
good as the Camp-Paulson except in a small region where the Poisson Gram- 
Charlier is still better. 

Fig. 2. Comparison of the maximum errors (M) of the “simple” approximations.

In conclusion, one needs only two approximations to match the cumulative
binomial distribution almost exactly. For small values of p the Poisson Gram-
Charlier approximation is exceedingly accurate; for larger values the same is
true of the Camp-Paulson approximation. The maximum error can be kept
below 0.005 (provided we exclude values of n < 5, where it is really unnecessary
to use any approximation) by following the simple rule of using the Poisson
Gram-Charlier approximation if np ≤ 0.8 and the Camp-Paulson approximation
if np ≥ 0.8. It is hoped that the present paper will serve to acquaint more people
with the merits of these two excellent approximations.
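
The selection rule just stated can be written as a small helper. This is only a sketch: the function name and the returned labels are hypothetical, the boundary np = 0.8 is assigned to the Poisson Gram-Charlier side (the text allows either), and values of p above one-half are first reduced to q = 1 - p by the symmetry property (2), which is an assumption about how the rule would be applied in practice.

```python
def choose_approximation(n, p):
    """Return (method, p_used, flipped) under the np <= 0.8 rule.

    flipped is True when p > 1/2, in which case the caller would evaluate
    B(n - k - 1, n, 1 - p) and subtract it from one, as in equation (2).
    """
    flipped = p > 0.5
    p_used = 1 - p if flipped else p
    if n < 5:
        method = "exact"                   # no approximation needed
    elif n * p_used <= 0.8:
        method = "poisson_gram_charlier"   # B_5 of equation (9)
    else:
        method = "camp_paulson"            # B_6 of equation (10)
    return method, p_used, flipped

for n, p in [(100, 0.005), (100, 0.3), (4, 0.3), (20, 0.98)]:
    print(n, p, choose_approximation(n, p))
```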
INTRODUCTION AND SUMMARY


IN RECENT years the practice of filing corporate income-tax returns for an
accounting year other than the calendar year has been steadily and rapidly
extended. This development of so-called “fiscal-year” returns is of high im- 
portance to various specialists and groups of specialists who are concerned with 
the preparation and filing of returns, or with interpretation of statistics com- 
piled from the returns. 

To members of the accounting profession, this development means an im- 
portant reduction in the peak load of auditing, which has long occurred in the 
interval between the end of the calendar year and the March 15 filing date
applicable to most corporations. Corporate officers, in turn, find their burdens 
considerably reduced, since they can collaborate with auditors under circum- 
stances less marked than formerly by pressure and urgency. In fact, much of 
the expanded use of fiscal-year reporting can be credited to efforts of the ac- 
counting profession to bring about these changes in the time distribution of 
the auditing load. 

To Treasury officials responsible for having funds available for day-to-day 
outlays of the government, the spreading of the corporate reporting year to 
periods other than the calendar year means a considerable smoothing out of the 
monthly flow of revenue from the corporate income tax. Until recent years, tax 
receipts of this sort came mainly in the months of March, June, September, and 
December. In view of the large share of the corporate income tax in total rev- 
enues and of the fact that receipts from the other major source—the individual 
income tax—were predominantly in the same months, the Treasury’s short- 
term financing problem of meeting needs for expenditures in the intervening 
months was somewhat aggravated. The provisions under the Mills amendment 
of the Internal Revenue Code for acceleration of the payment of the corporate 
income tax would greatly intensify the uneven monthly flow of revenue from 
this source if nearly all corporate income were reported on a calendar-year 
basis. With the expanding use of fiscal-year returns, however, and particularly
insofar as the tax liability of such returns is fairly evenly spread over the various 
months of the year, the monthly flow of revenue from the corporate income 
tax will during the peak effect of the Mills amendment be less uneven than if 
all reporting were for the calendar year. Under the 1954 revision of the Internal 
Revenue Code, corporate tax payments will soon again be distributed in quar- 
terly installments, in the second half of the year in which the income is earned 
and in the first half of the following year. Treasury receipts of revenue will con- 
tinue to come chiefly in the quarterly months, creating temporary large cash 
balances in the Treasury, with corresponding effects on the general economy 
which may in fact be more significant than the short-term Treasury financing 
problems mentioned above. These effects will be somewhat restrained by the 
increasing practice of filing fiscal-year returns, to the extent that fiscal years 
do not end in the four quarterly months. 

To those experts of the Treasury and the Bureau of the Budget concerned 
with forecasting the federal revenue, the task of predicting the corporate in- 
come subject to tax, and estimating from that figure the amount of tax liability, 
is notably aggravated by the presence, and the currently increasing
importance, of fiscal-year returns. These experts normally must provide, well in
advance of the annual Budget Message, an estimate of revenue receipts for the
fiscal years ending in the following June and in June of the next succeeding
year. This means a forecast of the corporate income-tax liability covering a
period reaching ahead at least a year and a half. And for a like period they must 
also forecast—among various other elements—the flow of dividends from cor- 
porations to individuals, which has a highly important effect upon the expected 
individual income-tax liability. Basic to both these forecasts is a prediction of 
corporate net income during the pertinent reporting years. This prediction, 
in its essence and ignoring many complexities, requires a study of the correla- 
tion of corporate profits with various economic factors such as commodity 
prices, industrial production, and other measures of business activity or con- 
dition. For these factors, dependable figures are in general much more nearly 
up to date than are any adequately comprehensive figures on corporate profits; 
and yet, even for these factors, some projection of figures into the unknown 
future is necessary before deriving the estimate of corporate profits. If all 
corporate income were reported on a calendar-year basis, this task of predicting 
corporate profits by such correlation methods would be sufficiently difficult and 
uncertain. When, however, a substantial fraction of corporate income is
reported for years not ending in December, the task becomes much more
complicated because of the possible need for studying the relevant economic factors
for the various periods pertinent to various reporting years of corporations. 
This difficulty is likely to be intensified whenever substantial and fairly rapid 
cyclical changes occur in any of the economic factors. It is also intensified by 
the fact that the selected fiscal year tends to end in a particular month for most
corporations in one line of industry, and in another month for those in some 
other industry, and by the fact that cyclical and other business changes have 
widely different impacts upon different lines of industry. Moreover, so long as 
the present tendency toward increased fiscal-year reporting continues, the use 
of compilations from corporate tax returns of preceding years as background for 
estimating figures in the near-term future is somewhat obstructed by the lack 
of stability in the relation of fiscal-year figures for various terminal months to
figures for the calendar year. 

To many other specialists—for example, those engaged in financial analysis, 
in describing and interpreting variations in national income, or in appraising 
general economic conditions—whether within or outside of government, the 
steady shift toward fiscal-year reporting may be highly important. Since 1916 
the United States Treasury has published annually Statistics of Income, which 
shows a wide variety of highly useful tables compiled from income-tax returns. 
We are here interested in such tabulations from corporate tax returns. Apart 
from those derived from the balance sheets which accompany most corporate 
tax returns, these tabulations include aggregates of income-account items for 
the corporate system as a whole, and for various groups within the system— 
classified chiefly according to line of industry and size of enterprise. As long as 
the fiscal-year returns were a very minor fraction of the total, analysts could 
assume that the tabulated aggregates were approximately pertinent to a 
calendar year, that is, a year centered at July 1. Now that fiscal-year returns have
become a much larger fraction of the total, this assumption may no longer be
valid. The Statistics of Income aggregates, published for the corporate system
as a whole for a specified year, pertain to a range of twelve-month periods with 
their terminal dates varying from July 31 of that year to June 30 of the follow- 
ing year, and cover also various accounting periods—so-called part years—of 
less than twelve months. Although the bulk of such an aggregate still pertains 
to a year ending December 31, the aggregate as a whole pertains to an “average” 
year centered somewhat later than July 1. For certain lines of industry, the 
aggregates may pertain to an average year centered notably later (or earlier) 
than for the entire system. And for some lines of industry, even the bulk of the 
returns may belong to a year ending in a single month other than December. 
In these instances, the assumption that the average year centers at July 1 is 
no longer even approximately valid, and we need to examine the question 
whether it is sufficiently valid even for the system as a whole. Analysts should 
take due notice of any impairment in its validity in any study which requires 
assigning a date to the figures for corporate profits or other income-account 
items (such as a comparison of any Statistics of Income aggregate with factors 
reflecting general or specific economic fluctuations or conditions). An additional 
difficulty appears whenever comparisons among records of quarterly figures are 
needed. Corporations which file fiscal-year tax returns are likely to use the same 
fiscal years for published corporate statements, and, unless the fiscal year ends 
on one of the quarterly months of the calendar year, the resulting quarterly 
figures are not readily comparable with other quarterly records. 

In view of the importance of the recent expansion in fiscal-year reporting to 
the specialists mentioned above, and perhaps others, I have undertaken an 
analysis of most of the special tabulations of fiscal-year returns which have been
compiled by the Bureau of Internal Revenue and published in successive annual
issues of Statistics of Income during a period of over two decades ending in 1950.¹
I report in the following sections the major results of this analysis, along with 
my admittedly limited commentaries upon certain implications of the more 
important findings. Part I is concerned with fiscal-year reporting in the cor- 
porate system as a whole, without regard to differences according to size or to 
line of industry; Part II, with differences among lines of industry; and Part III,
with differences according to size. 

Summary of main results. The percentage of the total number of corporate 
income-tax returns filed on a fiscal-year basis increased from about 12 in 1928 
to about 34 in 1950. The annual figures (from Table 1, in Part I) are as follows: 


1928   12.4          1940   19.7
1929   12.0          1941   20.6
1930   12.8          1942   21.8
1931   12.9          1943   22.6
1932   13.3          1944   23.6
1933   12.1          1945   24.8
1934   14.3          1946   26.9
1935   15.0          1947   30.8
1936   15.9          1948   32.6
1937   17.3          1949   33.9
1938   17.9          1950   33.7
1939   18.7





1 In Appendix A I give as full a list as I could compile of references to such special tabulations and the relevant
textual comments in Statistics of Income. Further footnote references to Statistics of Income will be indicated as
follows: S. of I., followed by the year and the page number or numbers. For those years (1934-1950) for which corporate
statistics are in Part 2 of S. of I., references will be to Part 2.

The evidence based upon number of returns is not entirely satisfactory, because,
as is brought out in Part III, fiscal-year reporting tends in general to be more 
common among small than among large corporations. Figures yielding per- 
centages in terms of total assets are available only for the years 1946-1949, but 
an estimate can be made for 1934. For these years the percentages of total 
assets of all corporations filing balance sheets that were reported on fiscal-year 
returns were (Table 3) as follows: 


1934 (estimated) 8.9 
1946 11.3 
1947 12.2 
1948 12.9 
1949 13.2 


These total-assets percentages are not only much smaller than the number-of- 
returns percentages, but they also show a less striking rate of increase in recent 
years. 

The distribution among the various accounting periods has changed notably 
between 1928 and 1950. The different accounting periods include: the calendar 
year; 11 fiscal-year periods, ending at the end of each of the five months pre- 
ceding and six months following December; and various part years. The per- 
centages of the total number of returns tabulated in Statistics of Income for 
1928 and for 1950 in these 13 accounting periods were (Table 4) as follows: 


1928 1950 1928 1950 
Part-year 7.50 6.18 January 1.47 2.87 
July 0.84 2.28 February 0.87 2.46 
August 0.82 2.52 March 1.18 4.12 
September 0.89 3.62 April 1.08 2.74 
October 0.83 2.84 May 1.16 2.56 
November 0.82 2.20 June 2.40 5.54 
December 80.14 60.07 


The 1928-1950 decline in the calendar-year percentage reflects the general
increase in fiscal-year reporting noted above. For each of the 11 fiscal-year
periods, a notable increase appeared between 1928 and 1950. In both years, June
was the most common fiscal-year period; March stood next in 1950, whereas 
January was the second most common period in 1928. The 1928-1950 changes 
in comparative size among the 11 fiscal-year percentages are in considerable 
degree due to diversities in fiscal-year reporting among lines of industry, which
are examined in Part II.
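
The arithmetic connecting this distribution with the annual fiscal-year percentages quoted earlier can be checked directly. The sketch below simply transcribes the 1928 and 1950 percentages from the table above; the variable and function names are illustrative.

```python
# Percentage of the total number of returns, by accounting period.
pct_1928 = {"Part-year": 7.50, "July": 0.84, "August": 0.82, "September": 0.89,
            "October": 0.83, "November": 0.82, "December": 80.14,
            "January": 1.47, "February": 0.87, "March": 1.18, "April": 1.08,
            "May": 1.16, "June": 2.40}
pct_1950 = {"Part-year": 6.18, "July": 2.28, "August": 2.52, "September": 3.62,
            "October": 2.84, "November": 2.20, "December": 60.07,
            "January": 2.87, "February": 2.46, "March": 4.12, "April": 2.74,
            "May": 2.56, "June": 5.54}

def fiscal_year_share(pct):
    """Sum over the 11 fiscal-year periods (all but December and part years)."""
    return sum(v for period, v in pct.items()
               if period not in ("December", "Part-year"))

for year, pct in (("1928", pct_1928), ("1950", pct_1950)):
    print(year, round(sum(pct.values()), 2), round(fiscal_year_share(pct), 2))
```

Each column adds to 100, and the eleven fiscal-year periods together account for 12.36 per cent of returns in 1928 and 33.75 per cent in 1950, which agrees, within rounding, with the figures of 12.4 and 33.7 given in the annual series.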

As number of returns is not for most purposes a satisfactory basis on which 
to measure the importance of fiscal-year reporting, the distribution among ac- 
counting periods is examined also in terms of total assets (Table 6). The basic 
figures required are unfortunately available only for the years 1946-1949; and, 
as total assets for part-year returns are not shown separately, the percentage 
for “December” combines the calendar-year and part-year returns. The per- 
centages of total assets tabulated for 1946 and 1949 from all returns accom- 
panied by balance sheets filed for each of the eleven fiscal-year periods, and for 
the combination of calendar-year and part-year periods, are:

              1946    1949                  1946    1949
July          0.74    0.88    January       1.45    1.56
August        0.84    1.03    February      0.55    0.64
September     1.05    1.38    March         0.79    1.09
October       1.29    1.46    April         0.74    0.86
November      1.20    1.20    May           0.68    0.78
December     88.     86.81    June          2.01    2.31


For fiscal-year periods ending in every month except November, the percentage 
rose from 1946 to 1949. The June period in each year is the most important, in 
terms of total assets, and that for January is second most important. 

Net income is a far less satisfactory measure of importance, for general pur- 
poses, than total assets. But for purposes relating to the flow of revenue from 
corporate taxes, the net-income measure, particularly of corporations showing 
net income rather than deficit, may have high significance. Moreover, the basic 
figures are available over a longer period than those for total assets. The per- 
centages of the combined net income of net-income and deficit corporations 
reported on returns filed for the various accounting periods as tabulated in 
1928 and 1950 are: 


1928 1950 1928 1950 
Part-year —1.63 1.00 January 1.41 3.11 
July 0.33 1.14 February 0.93 1.13 
August 0.97 1.49 March 0.41 1.91 
September 0.60 2.10 April 0.69 1.28 
October 0.94 2.36 May 0.54 1.39
November 1.51 2.02 June 2.21 3.73 
December 91.07 77.35 


The part-year figure for 1928 is negative because in that year part-year returns 
showed a deficit, while all returns showed net income. 

Fiscal-year reporting varies greatly among lines of industry. Statistics of In- 
come classifies returns according to eight broad divisions, excluding a category 
of returns which cannot be classified, and breaks down most of the divisions 
into more detailed groups and subgroups. We summarize here, from Part II, the
chief figures for the broad divisions in 1949. The first column in the table below 
indicates the percentage of each division’s total number of returns accompanied 
by balance sheets that was on a fiscal-year basis. The second column gives the 
percentage of each division’s total assets reported on fiscal-year returns. 


                 In Terms    In Terms                        In Terms    In Terms
                 of Number   of Total                        of Number   of Total
                             Assets                                      Assets
Agriculture        44.8        37.9      Public utilities      22.2         1.8
Mining             31.8        15.4      Trade                 41.4        46.3
Construction       36.6        33.9      Finance               18.3         4.2
Manufacturing      40.2        23.3      Services              37.9        40.5


The variation among the divisions is marked for the number-of-returns per- 
centages, and very striking for the total-assets percentages. On both bases, the 
lowest percentages are for Public utilities and Finance; and for both of these 
the total-assets percentages are so much lower than the number-of-returns per- 
centages that we may conclude that nearly all of the larger corporations in 
these two divisions file calendar-year returns.

Detailed evidence that fiscal-year reporting is in general more common
among small than among large corporations is presented in Part III. It may be 
summarized (from Table 15) in the following figures showing average total 
assets (in thousands of dollars) per return in 1949 for the entire corporate sys- 
tem and for each industrial division. 


                            All         Fiscal-Year    Other
                            Returns     Returns        Returns

All divisions combined        980          [?]         1,330
Agriculture                   284          [?]           319
Mining                      1,144          [?]         1,419
Construction                  198          [?]           207
Manufacturing               1,122          [?]         1,440
Public utilities            3,184          [?]         4,016
Trade                         229          [?]           210
Finance                     1,930          375         2,516
Services                      152          162           145

For every division except Trade and Services, the average of total assets is 
lower for fiscal-year than for other returns. The smaller average size of corpora- 
tions filing on a fiscal-year basis is especially striking for Public utilities and 
Finance; this is in accord with our finding above that nearly all large com- 
panies in these lines file calendar-year returns. 


PART I. THE CORPORATE SYSTEM AS A WHOLE 


1. Fiscal-year tabulations available. The earliest special tabulations from 
fiscal-year returns of corporations in Statistics of Income are in the volume for 
1926, which includes also fiscal-year tabulations for 1925. A largely similar set 
of tables appears in the 1927 issue. These fiscal-year tables for 1925-1927, how- 
ever, cover only returns which met at least one of the four following tests: net 
income of $2,000 or over, net deficit of $500,000 or over, gross sales or other
items of $5,000,000 or over, and deduction because of net loss for prior year. 
Hence, a large number of fiscal-year returns showing very small net income or 
showing small or moderately large deficit were excluded. Evidence for later 
years (see Part III) indicates that the excluded cases in 1925-1927 were prob- 
ably numerous and important in the aggregate. Moreover, as some possibility 
exists that small corporations may frequently be in industries with peculiar
patterns of fiscal-year reporting as respects distribution over the months from 
July to June, these exclusions are quite likely to distort the over-all pattern for 
all fiscal-year returns. And, of course, the total number of fiscal-year returns, 
as well as the aggregate for any particular accounting item, is seriously under- 
stated because of the exclusions. For these reasons, I have included no analyses 
of the 1925-1927 fiscal-year tabulations in any section of this report. That 
some useful inferences might be drawn from such analyses is not denied, but 
I am convinced that most analytical results for those years would not be
comparable with those for later years.

Beginning with 1928, the special tabulations from fiscal-year returns did aim 
to cover all active fiscal-year returns, provided such returns “were received by 
the Statistical Section [of the Bureau of Internal Revenue] prior to termination 
of the tabulation of Statistics of Income data.’’ This proviso, which is stated in 
the text accompanying the tables for many years after 1927, presumably means 
that certain fiscal-year returns—probably a small number, and probably con- 
fined chiefly to the late months, such as June and possibly May or even April— 
were not included. I have neglected this deficiency in my analyses on the 
assumption that it is probably very small. Beginning with 1939, the tabulations 
also included some information on the inactive fiscal-year returns; but, as the 
present analyses are concerned only with active returns, this deficiency of 
coverage in the years 1928-1938 has no bearing upon the results below. 

For each year from 1928 to 1950, at least two fiscal-year tabulations are pub- 
lished in Statistics of Income, though the form may differ slightly from year to 
year. One of these distributes the number of fiscal-year returns and the amount 
of their net income (or deficit) by months in which the fiscal years ended, 
separately for returns with net income and returns with deficit. The other dis- 
tributes the number of fiscal-year returns and the amount of their net income 
(or deficit) among size classes of net income (or deficit), also separately for re- 
turns with net income and returns with deficit. For certain years of this period, 
additional tabulations are included. These supplementary tables are commented 
upon at more length at those points in following sections where they are ana- 
lyzed. The data they contain are more detailed as to accounting items included, 
industrial classes shown, or in other respects than the data of the two basic 
tables, but no continuously comparable supplementary tables were published 
for any substantial period of years before 1946. Instead, the supplementary 
tables in all the earlier years seem to be isolated special compilations, available 
for only one year, or at most two or three years. From 1946 to 1949 (but not 
for 1950, however, probably because of the greatly increased importance of 
fiscal-year reporting) Statistics of Income includes two sets of supplementary 
tables in a standardized form. Each set presents an elaborate breakdown by 
line of industry, and these data, along with certain other materials, are the basis 
of analyses reported in Part II. One set of tables relates to balance-sheet re- 
turns, and the aggregate figures for all balance-sheet returns without regard to 
line of industry are analyzed in Section 4 of this Part.² The balance-sheet re- 
turns are, however, for most purposes of analysis, a highly dependable sample 
of all returns. While this evidence pertains to all corporation returns, regardless 
of month to which the balance sheet applies, I know of no clear reason for 
suspecting that any different conclusion would apply to the balance-sheet re- 
turns of fiscal-year corporations as a sample of all returns of such corporations. 

2. Increasing use of fiscal-year reporting: number of returns. From the first of 
the two standard tabulations of fiscal-year returns in Statistics of Income, the 
total number of fiscal-year returns tabulated for each year can be obtained. 
The general tables, which constitute the main body of the compilations re- 
ported in Statistics of Income, combine all types of active returns—those for 





2 Those corporate income-tax returns accompanied by balance sheets are generally called “balance-sheet 
returns” here. The overwhelming bulk of all returns were accompanied by balance sheets, and the balance-sheet 
coverage for corporations showing large net incomes or large deficits was very nearly complete. In 1949 about 94 
per cent of returns showing net income supplied balance sheets, and about 84 per cent of returns showing no net in- 
come. The deficiency in balance sheets was largest for returns showing very small net income or very small deficit: 
86 per cent of returns with net income below $1,000 were accompanied by balance sheets, and nearly 78 per cent of 
those with deficits under $1,000 (figures from S. of I., 1949, p. 8). This situation was not significantly different in 
years before 1949. 
the calendar year, those for fiscal years, and those for part years.³ Such a com- 
pilation for 1950 appears in Table 3 of Statistics of Income for 1950, Part 2, and 
the fiscal-year tabulations appear on pages 21-22. Table 1, below, taken from 
these tabulations, traces the changes in the fiscal-year share of the total over 
the years 1928-1950. From a level moderately above 12 per cent in 1928, the 
fiscal-year share in the total rose steadily, except for a very slight dip in 1929 
and a moderate dip in 1933, to nearly 34 per cent in 1950. The rise per year was 
remarkably steady from 1938 to 1945, and exceptionally steep in the years 


TABLE 1 


SHARE OF ALL RETURNS FILED BY FISCAL YEAR, 
IN TERMS OF NUMBER, 1928-1950 


                 Number of Returns 
Year             All            Fiscal-Year 
                 (a)            (b) 

1928             443,611        54,820 
1929             456,021        54,609 

1930             463,036        59,202 
1931             459,704        59,508 
1932             451,884        59,459 
1933             446,842        53,883 
1934             469,804        67,056 

1935             477,113        [illegible] 
1936             478,857        [illegible] 
1937             477,838        [illegible] 
1938             471,032        [illegible] 
1939             469,617        [illegible] 

1940-1944        [illegible]    [illegible] 

1945             421,125        [illegible] 
1946             491,152        [illegible] 
1947             551,807        [illegible] 
1948             594,243        [illegible] 
1949             614,842        [illegible] 

1950             629,314        212,391 





3 Part-year returns apply to an accounting period shorter than twelve months, when the greater part of the 
period falls within the specified calendar year. Such returns represent reorganizations, newly organized corporations, 
liquidations, changes from calendar-year to fiscal-year basis, or vice versa, and presumably also changes from one 
fiscal year to a fiscal year ending in a different month. Changes in specification of reporting year are subject to ap- 
proval by the Commissioner of Internal Revenue. The fiscal-year returns included in the 1949 tabulations are those 
ending in any month from July to November 1949 and January to June 1950; and similarly, for tabulations of other 
years, fiscal-year returns range from July of the specified year to June of the succeeding year. This arrangement, by 
which the tabulations for any one year include returns for fiscal years extending to the following June, delays pub- 
lication of Statistics of Income. As amply shown in later sections, however, failure to include such returns would set 
the center of the average year well before July 1 and greatly impair the usefulness of the tabulations for making 
comparisons. One reservation on this point should be made: If the included fiscal years extended from June of the 
given year to May of the following year, one month would be saved in publication time, without any significantly 
worse location of the center of the average year than at present. 
1946-1948. This evidence strongly suggests that although fiscal-year reporting 
may have been negligible for most purposes of analyzing Statistics of Income 
data in the early years of the twenty-three-year period, it has now become a 
factor which cannot safely be ignored. As will appear in Sections 3 and 4, 
however, mere number of returns is not an entirely adequate indication of 
importance, and certain tests of other measures of importance are reported 
there. 
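
The fiscal-year shares quoted in this section can be checked from the counts in Table 1. The following is a minimal sketch, assuming the rows of Table 1 run in chronological order from 1928; the names `RETURNS` and `fiscal_year_share` are illustrative only, and just the years for which the text states an approximate level or a dip are included.

```python
# Counts from Table 1: year -> (all active returns, fiscal-year returns).
# Years are assigned by row order, which is an assumption of this sketch.
RETURNS = {
    1928: (443_611, 54_820),
    1929: (456_021, 54_609),
    1932: (451_884, 59_459),
    1933: (446_842, 53_883),
    1950: (629_314, 212_391),
}


def fiscal_year_share(year: int) -> float:
    """Fiscal-year returns as a percentage of all returns for the given year."""
    all_returns, fiscal_returns = RETURNS[year]
    return 100.0 * fiscal_returns / all_returns


for year in sorted(RETURNS):
    print(f"{year}: {fiscal_year_share(year):.2f} per cent")
```

On these figures the share is a little above 12 per cent in 1928, slightly lower in 1929, lower in 1933 than in 1932, and just under 34 per cent in 1950, in line with the description above.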

3. Increasing use of fiscal-year reporting: amount of net income or deficit. The 
only measures of importance, other than number of returns, available for com- 
parisons between fiscal-year returns and all returns for every year from 1928 to 
1950 are aggregate amount of net income for returns showing net income, ag- 
gregate amount of deficit for returns showing no net income, and aggregate net 
income for both categories combined. These measures of “importance” are far 
from satisfactory, chiefly because of wide fluctuations from year to year in the 
amount of net income (or deficit) reported by corporations having a specified 
size or a specified volume of gross business, or meeting some other test of im- 
portance less susceptible to annual fluctuation than net income or deficit.⁵ 

Despite these defects in net income (or deficit) as a measure of importance, 
it probably has more significance than the mere number of corporations, at 
least for some purposes. As shown in Part III, number of corporations may give 
misleading indications if fiscal-year reporting is to an important extent more 
(or less) commonly practiced by large than by small corporations. Therefore, in 
Table 2, I present certain comparisons based on net income (or deficit) as a 
measure of importance. The basic data for this purpose are taken from the 
same Statistics of Income tabulations, for each year, as the figures in Table 1, 
and the percentages are derived in the same manner. While the fiscal-year per- 
centages, both for net income and for deficit, run much higher in recent years 
than in the early years of the period, the year-to-year variations are much more 
irregular than those in Table 1. This is not surprising, in view of the above- 
noted possibility that cyclical variations in corporate earnings may have differ- 
ent impacts upon different lines of industry and that fiscal-year returns may be 
much more important in certain lines of industry than in others. Nevertheless, 
despite the irregular course of change over the twenty-three-year period, both 
the net-income and the deficit percentages give unmistakable evidence of a 
very large increase in the fiscal-year share of the total between the beginning 
and the end of the period. 





4 Strictly, figures are also available for amount of tax on returns showing net income. But, because of many 
changes in tax rates and other factors determining the amount of tax liability pertinent to a specified net income, 
figures on amount of tax can afford no dependable annual comparisons of the importance of fiscal-year reporting. 
For certain other purposes, such as studying the monthly flow of revenues from the corporate income tax and re- 
lated taxes, the amount of tax on fiscal-year returns may, of course, be highly significant. 

5 Moreover, until we have examined some of the evidence presented in later sections, we cannot be confident 
that the net income (or deficit) figure for the fiscal year ending in a particular month—for example, January—is not 
exceptionally large or exceptionally small in comparison with the figure for all returns tabulated for the correspond- 
ing calendar year. Many corporations in particular industries, with a profit experience sharply different from the 
average for all industries, may happen to choose that fiscal year as the accounting period for their reporting. For 
example, retail stores in the Apparel and accessories group tend very commonly to use a fiscal year ending in Jan- 
uary. In these circumstances, our use of the aggregate net income for all fiscal-year corporations reporting for the 
period ending in January would be misleading. 
TABLE 2 


SHARE OF ALL NET INCOME FOR RETURNS WITH NET INCOME AND 
ALL DEFICIT FOR RETURNS WITH NO NET INCOME REPORTED ON 
FISCAL-YEAR RETURNS, 1928-1950 


(dollars in millions) 


Returns Showing Net Income Returns Showing No Net Income 
Amount of Net Income Amount of Deficit 
All Fiscal-Year Fiscal-Year d/e, 
(a) (b) % 
$10,618 $1,229 15.1 
1,212 14.4 
1950 44,141 


In slightly over half of the twenty-three years the deficit percentage is above 
the net-income percentage, and in several years the difference between the two 
is striking. This may mean that in these years corporations with large deficits 
were more likely to file fiscal-year returns than were those with large net in- 
comes, but, at this stage, no sure inference on this point can be drawn. (See 
discussion in terms of a classification of corporations, both fiscal-year and all, 
according to size of net income or deficit, in Part III.) 

4. Increasing use of fiscal-year reporting: total assets. The most satisfactory 
general-purpose measure of the importance of corporations is total assets. This 
measure, whether applied to one corporation or a group of corporations, is not 
entirely stable, but it is mainly free of those sharp and diverse cyclical varia- 
tions which detract from the usefulness of net income (or deficit) as a measure 
of importance, particularly for comparing different groups of corporations. The 
total-assets measure, however, has shortcomings which may in some instances 
prove serious, chief among them the peculiar shape of the size distribution of 
corporations, to which some attention is given in Part III. That shape is marked 
by a dense clustering of corporations in the lowest size class in terms of total 
assets, and also by an immense range of size, with a very small number of 
corporations showing huge total assets. As a result, a handful of corporations, 
or even a single corporation, may dominate the aggregate total assets of a 
particular class of corporations, such as those in a particular industry. The 
aggregate is not dependably representative of the more numerous and typical 
corporations in the class, and this type of distortion may vary sharply from 
class to class, and possibly between fiscal-year and other returns. 

The total-assets figure is available only for balance-sheet returns. As already 
noted in Section 1, one of the sets of supplementary tables of fiscal-year figures 
in Statistics of Income for the years 1946-1949 shows, by industrial classes, data 
compiled from fiscal-year returns which were accompanied by balance sheets. 
The data comprise only two items, number of returns and total assets; these 
are given for each of the eleven fiscal-year periods and for all such periods com- 
bined, separately for net-income returns and deficit returns. From these figures 
for the entire corporate system without regard to industry, we can derive the 
totals shown in columns b and d of Table 3, for 1946-1949. The corresponding 
figures for all balance-sheet returns are from Table 4 of Statistics of Income for 
these years. Supplementary tables such as those for 1946-1949 do not appear in 
Statistics of Income for any earlier year, but the issue for 1934 includes a special 
tabulation of fiscal-year balance-sheet returns according to size of total assets 
(see Part III for discussion of these size distributions). With these data, the 
aggregate total assets of fiscal-year balance-sheet returns for 1934 can be esti- 
mated. 


TABLE 3 


BALANCE-SHEET FISCAL-YEAR RETURNS COMPARED TO ALL BALANCE- 
SHEET RETURNS, IN TERMS OF NUMBER AND TOTAL 
ASSETS, 1934 AND 1946-1949 


(dollars in millions) 


          Number of Returns                       Total Assets 
Year      All        Fiscal-Year   %              All        Fiscal-Year   % 
          (a)        (b)                          (c)        (d) 

1934*     410,626     62,794       15.3           $301,307   $26,770       8.9 
1946      440,750    126,854       [illegible]     454,705   [illegible]   [illegible] 
1947      496,821    160,501       [illegible]     501,315   [illegible]   [illegible] 
1948      536,833    186,381       [illegible]     525,136    67,695       [illegible] 
1949      554,573    199,912       36.0            543,562   [illegible]   13.2 


* Total assets of fiscal-year returns, 1934, estimated as explained in Appendix D. 


Table 3 traces the fiscal-year share of balance-sheet returns in terms of num- 
ber and of total assets from 1934 to 1949. The percentages in terms of number 
for the five years shown run slightly above the percentages for corresponding 
years in Table 1. This simply means that fiscal-year returns were somewhat 
more likely to be accompanied by a balance sheet than were other returns. In 
the fifteen years covered, the percentage in terms of number of returns in- 
creased from 15.3 to 36.0—about 135 per cent. The corresponding increase of 
the percentage in terms of total assets was from 8.9 to 13.2—about 48 per cent. 
That this increase is much less striking than the increase in terms of number 
implies that the huge expansion in fiscal-year reporting from 1934 to 1949 had 
a relatively greater impact upon small than upon large corporations. Specifi- 
cally, we know that in certain industries—banking, insurance, and public 
utilities—which include some of the largest corporations, very few of the large 
corporations filed on a fiscal-year basis in 1934 or shifted over to that basis in 
later years. If we exclude the Finance and Public-utilities divisions from the 
total list of corporations in 1934, the percentages become 17.7 in terms of 
number and 18.5 in terms of total assets. These correspond to 40.1 and 29.2 for 
1949. Accordingly, after these exclusions, the ratio in terms of number rose 
from 17.7 to 40.1, or 126 per cent, between 1934 and 1949, and the ratio in 
terms of total assets rose from 18.5 to 29.2, or 58 per cent. 
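
The relative increases quoted in the two preceding paragraphs follow from the rounded shares by simple arithmetic. The sketch below is illustrative only; the function name `relative_increase` and the dictionary labels are assumptions, not part of the source. With the rounded shares as inputs, the third figure comes out nearer 127 than 126 per cent, so the 126 in the text presumably rests on unrounded percentages.

```python
def relative_increase(start: float, end: float) -> float:
    """Percentage by which `end` exceeds `start`."""
    return 100.0 * (end - start) / start


# (fiscal-year share in 1934, fiscal-year share in 1949), in per cent, as quoted in the text.
SHARES = {
    "number of returns, all industries": (15.3, 36.0),
    "total assets, all industries": (8.9, 13.2),
    "number of returns, excluding finance and public utilities": (17.7, 40.1),
    "total assets, excluding finance and public utilities": (18.5, 29.2),
}

for label, (share_1934, share_1949) in SHARES.items():
    print(f"{label}: {relative_increase(share_1934, share_1949):.1f} per cent")
```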

Whether we examine the change from 1934 to 1949 on the all-inclusive basis 
or on the restricted basis excluding finance and public utilities, the share of 
fiscal-year returns in the total was much greater in 1949 than in 1934. The in- 
crease in terms of number of returns is more striking than in terms of total 
assets, but even in terms of total assets, and especially on the restricted basis, 
the increase is very substantial. Nearly 30 per cent—the average in terms of 
assets for a wide sweep of industries other than public utilities and finance in 
1949—is assuredly high enough to support our earlier conclusion that one can 
no longer regard fiscal-year returns as a negligible element in analyzing and 
interpreting corporate tabulations in Statistics of Income. 

5. Monthly distribution of accounting years: number of returns. As indicated in 
Section 1, one of the two standard tables of fiscal-year figures available for 
every year from 1928 to 1950 is the distribution of fiscal-year returns according 
to the terminal month. Statistics of Income shows also, for each year, the number 
of part-year returns, as well as the figures for all returns without regard to 
accounting period. From these data, we can derive a percentage distribution of 
all returns among the following accounting periods: calendar year, fiscal years, 
separately for each of the eleven fiscal-year periods ending in the months July— 
November and January-June, and part years.⁶ In 1950, slightly over 60 per 
cent of the returns were filed for the calendar year, and slightly under 34 per 
cent were filed for the various fiscal years ending from July 1950 to June 1951. 
Of the fiscal-year returns, the chief concentrations were for years ending in 
June, with 5.54 per cent of the over-all total; March, with 4.12 per cent; and 
September, with 3.62 per cent. The percentage for each of the other fiscal-year 
months fell somewhere in the range from 2.20 to 2.87; and this reflects a fairly 
even spread of fiscal-year returns among all the months except June, March, 
and September. Part-year returns amounted to 6.18 per cent of the over-all 
total, but we have no means of identifying, from the published tabulations, the 
lengths or terminal dates of the relevant accounting periods.⁷ 
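
The 1950 allocation can be retraced as a residual calculation: the calendar-year share is what remains after the fiscal-year and part-year shares are taken out of the total. The sketch below is a hedged illustration; the 1950 counts are those in Table 1, the remaining percentages are those quoted in the text, and all variable names are assumptions.

```python
ALL_RETURNS_1950 = 629_314      # Table 1, column (a)
FISCAL_RETURNS_1950 = 212_391   # Table 1, column (b)
PART_YEAR_SHARE = 6.18          # per cent of all returns, quoted in the text
PEAK_MONTHS = {"June": 5.54, "March": 4.12, "September": 3.62}  # per cent, quoted in the text

fiscal_share = 100.0 * FISCAL_RETURNS_1950 / ALL_RETURNS_1950
# Calendar-year share is the residual once fiscal-year and part-year returns are removed.
calendar_share = 100.0 - fiscal_share - PART_YEAR_SHARE
# Average share of the eight fiscal-year months other than June, March, and September.
other_months_average = (fiscal_share - sum(PEAK_MONTHS.values())) / 8

print(f"fiscal-year share:       {fiscal_share:.2f}")
print(f"calendar-year share:     {calendar_share:.2f}")
print(f"average of other months: {other_months_average:.2f}")
```

The residual comes to slightly over 60 per cent, and the average for the eight remaining months falls inside the 2.20 to 2.87 range given above.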

Similar computations for each year from 1928 to 1950 appear in Table 4. In 





6 In certain instances, some departure from this standard scheme of allocation becomes necessary, and this will 
be pointed out at the time. It should be noted here, however, that part-year returns can be separated only in con- 
nection with the entire corporate system. Because of the very limited form in which part-year returns are tabulated in 
S. of I., allocation is not possible according to size or line of industry, or for balance-sheet returns as a whole and in 
various subclasses. Therefore, except for over-all analyses for each year in this section and the analyses in Tables 5 
and 7, the December figure includes not only the returns for the calendar year but also all part-year returns. 

7 The length of a part year can apparently range from one to eleven months. Part-year returns are tabulated as 
pertaining to a particular year if “the greater part of the income period” falls in that year (see S. of I., 1949, p. 32). 
This makes clear to which year a part-year return which overlaps January 1 and includes an odd number of months 
would be allocated, but it does not clearly cover part-year returns including an even number of months (see also 
Section 7, and Appendix C). 
the last column of this table, the 1928 percentages are repeated, to facilitate 
comparison with those for 1950. The 1950 and 1928 columns give striking evi- 
dence of the great shifts in accounting periods over twenty-three years. The 
percentage for the calendar-year returns (December figures) is about three- 
fourths as high in 1950 as in 1928.⁸ The percentage for each of the eleven fiscal- 
year terminal months shows a sharp increase over the twenty-three-year period, 
though the degree of this increase varies widely among the eleven months. The 
percentage for January nearly doubled, while that for September was quad- 
rupled. Although the course for each of the twelve months is not completely 
free from interruption, each month shows a surprisingly steady course. After 
some irregularities in the first six years, the December percentage declines 
without any interruption in every year following 1933. Similarly, the percentage 
for each of the other eleven months shows, except for isolated interruptions, a 
steady upward course. 

In 1933, each of the eleven fiscal-year months shows a decline from 1932.⁹ 
In no other year do we find such a uniform dip. In 1946 nearly all of the five 
early fiscal-year months (July-November) show declines from 1945, but all of 
the six late fiscal-year months (January—June) show emphatic increases from 
1945. The number of fiscal-year returns increased from 1945 to 1946 for each of 
the eleven months, but such increases were much smaller for the months before 
December than for the six months following.¹⁰ 

6. Monthly distribution of accounting years: in terms of selected accounting 
items. The preceding section discusses the monthly distribution merely in terms 
of number of returns; but, for various purposes, greater interest may attach to 
the distribution, among the various accounting periods, of the compiled ag- 
gregate of some particular accounting item, such as net income (or deficit), or 
total assets, or total receipts. By methods similar to those used to derive the 
percentages of Table 4, we can allocate the total net income of net-income 
corporations and the total deficit of deficit corporations among the various 
accounting periods. Such percentage allocations for 1928 and 1950, as well as 
the allocations of the net income minus deficit of the net-income and deficit 





8 The same is approximately true of the part-year percentage, although this has relatively little bearing on 
changes in the fiscal-year percentages. The part-year percentage is subject to peculiar variations: for example, its 
exceptionally high values in 1946 and 1947 probably reflect the remarkable increase in new corporate charters fol- 
lowing the war. Likewise, the very low levels in 1942-1944 probably reflect the comparative absence of new charters 
in those years. But we must not forget that part-year returns also frequently arise when corporations are going out of 
business, and many such disappearances occurred in the war years. Without more elaborate compilations from part- 
year returns, we can do little better than guess. 

9 I have already noted, in discussing Table 1, that a dip in the number of fiscal-year returns occurred in 1933, 
and one may wonder whether the tabulations for fiscal-year returns, like those for part-year returns (see Table 4, 
footnote a), were incomplete for 1933. I have, however, found no note in any later issue of S. of I. to indicate such a 
deficiency. 

10 The great wave of new incorporations following the war resulted in the total number of active returns (regard- 
less of accounting period) rising from 421,125 in 1945 to 491,152 in 1946. Conceivably this outburst of chartering 
led to the commencement of many corporate businesses in the spring of 1946; these might then appear in the 1946 
tabulations as having fiscal years ending in various 1947 months up to June. Moreover, the extensive shift from 
partnership (or other unincorporated) forms of business to the corporate form in 1946 may have created more cor- 
porations with fiscal years ending after than before December (see W. L. Crum, Age Structure of the Corporate System, 
University of California Press, 1953, pp. 114, 122). Whatever the cause, the 1945-1946 increase in number of fiscal- 
year returns was at a greater rate in the months January to June, and at a smaller rate in most of the months July 
to November, than the 1945-1946 change in the over-all total from 421,125 to 491,152. This explains the peculiar 
1946 percentages in Table 4. 
a The part-year percentage for 1933 is estimated as the approximate average of such percentages for 1931, 
1932, 1934, and 1935. The part-year percentage derived from S. of I., 1933, is 0.40, but on page 34 of the 1934 issue 
there is a note that this is a serious understatement and corrected figures cannot be supplied. Any error in my 
estimate causes an equal error in the opposite direction in the December figure, since the latter is obtained by sub- 
tracting part-year returns from all non-fiscal-year returns. 
categories combined, appear in Table 5.¹¹ Separate treatment of the two cate- 
gories—net-income corporations and deficit corporations—is preferable, since 
the offsetting of negative figures against positive figures can frequently yield 
misleading impressions of relative importance, because of haphazard variations 
in a residual obtained by subtraction. This danger is particularly serious in 
studying fiscal-year returns because in different industries the bulk of such 
returns tend to have different terminal months. If one of these industries is 
mainly showing net income and the other mainly deficit, the use of figures com- 
bining net-income and deficit categories may conceal significant relationships 
(see further discussion on this point in Part II). 

Table 5 shows that in 1950, for each of the fiscal-year months except January, 
the net-income percentage is lower than the deficit percentage. Approximately 
similar relationships hold for 1928. Correspondingly, of course, the 1950 calen- 


TABLE 5 


PERCENTAGE DISTRIBUTION BY FILING PERIOD OF NET INCOME OR 
DEFICIT FOR RETURNS WITH NET INCOME OR NO NET 
INCOME AND BOTH CATEGORIES COMBINED, 1928 AND 1950 


                 Net Income, for Returns    Deficit, for Returns with    Both Combined 
                 with Net Income            No Net Income 


                 1928    1950 


July             0.48 
August           0.96 
September        0.76 
October          1.04 
November         1.48 
                 86.33 

                 1.42 
                 0.83 
dar-year percentage is much higher in terms of net income than in terms of 
deficit: 76.62 against 56.12. The full implications of this difference cannot be 
understood without further knowledge of differences in fiscal-year reporting 
according to line of industry and to size of enterprise, as examined in Parts II 
and III. We may at this stage merely suggest that in 1928 and 1950 calendar- 
year returns were more likely to show net income than were returns filed for 
other accounting periods. 

In the net-income returns figures of Table 5, we note that, accompanying a 
large decline in the calendar-year figure and a small decline in the part-year 
figure, the figures of each of the fiscal-year months experienced a sharp ad- 
vance from 1928 to 1950. The same conclusions can be drawn from the figures 
for the deficit returns, except for the increase in part-year returns. This twenty- 
three-year increase in fiscal-year reporting has therefore not resulted from an 
expansion limited to any one accounting period, such as that ending in October, 
but has appeared in all of the non-calendar twelve-month accounting periods. 





11 For one important purpose, discussed below in Section 7, such a set of figures for both categories combined— 
obtained by subtracting each deficit from the corresponding net income—is useful. 
A similar study of the monthly distribution of the balance-sheet returns, in 
terms of number of returns and of total assets, is possible for the years 1946- 
1949. The analysis is carried out on the same lines, except that the relevant 
figures for part-year returns are not available for balance-sheet returns and 
therefore cannot be deducted from the non-fiscal-year figures to yield the true 
calendar-year figures. The results, for 1946 and 1949, are shown in Table 6. Both 
for number of returns and for total assets the percentage rises from 1946 to 
1949 for each of the fiscal-year months (except November, for total assets), 
whereas it declines for December, which includes both calendar-year and part- 
year returns. 

Looking at 1946 and 1949 separately, we find the total-assets percentages are 
uniformly lower than the number-of-returns percentages for all fiscal-year 


TABLE 6 


PERCENTAGE DISTRIBUTION BY FILING PERIOD OF NUMBER OF 
BALANCE-SHEET RETURNS AND TOTAL ASSETS TABU- 
LATED FROM BALANCE SHEETS, 1946 AND 1949 


               Number of Returns      Total Assets 

               1946      1949         1946      1949 
July            1.84      2.60         0.74      0.88 
August          2.04      2.80         0.84      1.03 
September       2.60      3.96         1.05      1.38 
October         2.26      2.91         1.29      1.46 
November        2.00      2.24         1.20      1.20 
December*      71.22     63.95        88.66     86.81 
January         2.52      2.90         1.45      1.56 
February        2.00      2.42         0.55      0.64 
March           3.08      4.03         0.79      1.09 
April           2.66      3.09         0.74      0.86 
May             2.54      2.86         0.68      0.78 
June            5.25      6.23         2.01      2.31 


* Includes calendar-year and all part-year returns. 


months, and the reverse is true for the December percentages. This clearly im- 
plies that, on the average, fiscal-year returns have lower total assets than other 
returns; that is, fiscal-year reporting is more prevalent among smaller than among 
larger corporations. (See Part III for further evidence on this point.) 
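
The comparison drawn from Table 6 can be made explicit with a short calculation. The sketch below is illustrative only: the percentages are copied from Table 6, while the names `TABLE_6`, `fiscal_year_share`, and `assets_below_number` are assumptions introduced for the example.

```python
# Table 6 percentages: period -> (number 1946, number 1949, assets 1946, assets 1949).
# "December" includes calendar-year and all part-year returns.
TABLE_6 = {
    "July": (1.84, 2.60, 0.74, 0.88),
    "August": (2.04, 2.80, 0.84, 1.03),
    "September": (2.60, 3.96, 1.05, 1.38),
    "October": (2.26, 2.91, 1.29, 1.46),
    "November": (2.00, 2.24, 1.20, 1.20),
    "December": (71.22, 63.95, 88.66, 86.81),
    "January": (2.52, 2.90, 1.45, 1.56),
    "February": (2.00, 2.42, 0.55, 0.64),
    "March": (3.08, 4.03, 0.79, 1.09),
    "April": (2.66, 3.09, 0.74, 0.86),
    "May": (2.54, 2.86, 0.68, 0.78),
    "June": (5.25, 6.23, 2.01, 2.31),
}
COLUMNS = ("number 1946", "number 1949", "assets 1946", "assets 1949")


def fiscal_year_share(column: int) -> float:
    """Sum of the eleven fiscal-year months, i.e. every period except December."""
    return sum(row[column] for period, row in TABLE_6.items() if period != "December")


# True if, in both years, each fiscal-year month holds a smaller share of total
# assets than of the number of returns.
assets_below_number = all(
    row[2] < row[0] and row[3] < row[1]
    for period, row in TABLE_6.items()
    if period != "December"
)

for index, name in enumerate(COLUMNS):
    print(f"fiscal-year share, {name}: {fiscal_year_share(index):.2f}")
print("assets share below number share in every fiscal-year month:", assets_below_number)
```

The combined fiscal-year share for 1949 comes to about 36.0 per cent of returns but only about 13.2 per cent of total assets, matching the 1949 percentages cited in Section 4.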

No tables published in Statistics of Income for any year before 1946 afford 
any basis for determining the monthly distribution of fiscal-year returns in 
terms of such a generally satisfactory measure of importance as total assets. 
For 1928, 1929, and 1930, however, supplementary tables show aggregate 
figures for numerous income-account items for the fiscal-year returns filed for 
each of the fiscal-year months, and for part-year returns. Such fiscal-year ag- 
gregates can be compared with corresponding aggregates for the entire list of 
returns—regardless of accounting period—as customarily presented in the main 
tables of Statistics of Income. One income-account item, net income (or deficit), 
has already been examined in an earlier section. We now examine another item: 
total compiled receipts. This is essentially the item showing total gross income 
of the corporations, although as a matter of fact some elements making up the 
item, such as dividends or interest received, are not truly “gross” in the same 
sense as the gross sales or gross receipts from operations. Nevertheless, total 
compiled receipts is fairly indicative of the gross volume of business, and it is 
much less susceptible to wide fluctuations than is net income (or deficit). It is 
accordingly a better measure of importance than net income (or deficit), but 
probably not as stable a measure as total assets. Because of differences in the 
average rate of turnover of assets among lines of industry, and because of the 
tendency for certain lines to have their fiscal-year returns concentrated in ac- 
counting periods ending in particular months, this measure may give a picture 
of the monthly distribution very different from that afforded by total assets. 

Table 7 gives the percentages for the various accounting periods, obtained 
by methods essentially similar to those used for Table 4.¹² The changes from 
year to year, for any particular month, are fairly small, and they are sufficiently 
irregular in amount and direction to afford no apparent generalizations. The 
sharp cyclical changes in business activity in these years probably affected 
different lines of business differently, and this may account for some of the 
differences among the various months in the course of change over the period 
1928-1930. 


TABLE 7 

PERCENTAGE DISTRIBUTION BY FILING PERIOD OF TOTAL 
COMPILED RECEIPTS, 1928-1930 

Filing period      1928      1929      1930 

July               0.98      0.87      0.93 
August             1.03      1.02      1.00 
September          0.94      0.97      0.98 
October            3.12      3.25      3.32 
November           1.59      1.47      1.45 
December          79.20     79.31     80.35 
January            2.43      2.61      2.66 
February           1.61      1.46      1.61 
March              1.20      1.14      1.02 
April              1.03      1.04      0.94 
May                1.39      1.17      1.07 
June               3.01      2.22      2.54 
Part-year          2.46      3.48      2.13 

Since no such tables have appeared for fiscal-year returns in any issue of 
Statistics of Income since 1930, no picture can be given of the long-run shifts in 
the shape of the monthly distribution of total compiled receipts. Moreover, as 
noted above, these percentages for the years 1928-1930 are not properly com- 
parable with percentages for total assets for the years 1946-1949. Both total 
compiled receipts and total assets are fairly general measures of importance, 
but this does not mean that they are interchangeable measures. 

7. Estimate of the average year. The findings in the foregoing sections are of 
interest in summarizing certain facts about the importance of fiscal-year reporting 
in the corporate system as a whole, the changes in the fiscal-year share over 
the years, and the monthly distribution of the accounting periods. These facts 
have various important implications, but we are now concerned with a particular 
implication. What do the monthly shape of fiscal-year returns and the 
progressive change in that shape imply as to the average accounting period 
represented by the comprehensive tables in Statistics of Income, compiled from 
all returns regardless of accounting period? As indicated in certain introductory 
paragraphs of this report, this question may have high practical importance for 
various users of such tables. The effects on the average accounting year are of 
two sorts: a dislocation of its center from July 1, the center of the calendar 
year, and a modification of the indicated intensity of cyclical variations as 
reflected by the tabulated annual data. The first of these can be measured with 
fair precision, and is treated in this and certain later sections of the report. The 
second is much more elusive, and is discussed on a suggestive basis in Appendix B. 

12 Here, as in the case of Table 4, data for part-year returns are available, and the December figure pertains 
only to calendar-year returns. 

To determine the extent to which the center of the average year differs from 
July 1, we note that the center of a fiscal year ending on July 31 is February 1, 
five months before July 1. Similarly, the center of each of the ten other fiscal- 
year periods falls a specified number of months before or after July 1. These 
departures range from —5, through 0 for the calendar year, to +6, for the 
twelve different twelve-month accounting periods. If, for a particular year—for 
example, 1928—we weight each of these departures by its percentage as shown 
in Table 4, and calculate the weighted average, we have the number of months 
by which the center of the average year departs from July 1, 1928.¹³ 
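
The weighting procedure just described can be sketched in a few lines of Python. Because Table 4 is not reproduced in this part of the report, the sketch uses instead the total-compiled-receipts percentages of Table 7 as weights; the function and variable names are illustrative only. Under the assumption, stated in the text, that part-year returns have a departure of 0, the calculation gives roughly 0.187 month for 1928 and 0.130 month for 1929, which agree with the corresponding entries of Table 8.

```python
# Departures (in months) of the center of each accounting period from July 1,
# ordered by closing month from July (-5) through the following June (+6);
# December (the calendar year) has departure 0.
CLOSING_MONTHS = ["July", "August", "September", "October", "November", "December",
                  "January", "February", "March", "April", "May", "June"]
DEPARTURES = list(range(-5, 7))

# Percentages of total compiled receipts by closing month, copied from Table 7.
TABLE_7 = {
    1928: [0.98, 1.03, 0.94, 3.12, 1.59, 79.20, 2.43, 1.61, 1.20, 1.03, 1.39, 3.01],
    1929: [0.87, 1.02, 0.97, 3.25, 1.47, 79.31, 2.61, 1.46, 1.14, 1.04, 1.17, 2.22],
}
PART_YEAR = {1928: 2.46, 1929: 3.48}


def center_departure(monthly_pct, part_year_pct):
    """Weighted average departure from July 1, in months.

    Part-year returns are given departure 0, following the assumption in the
    text, so they add to the total weight but not to the weighted sum.
    Dividing by the sum of the weights allows for published percentages that
    do not add to exactly 100.
    """
    weighted = sum(w * d for w, d in zip(monthly_pct, DEPARTURES))
    total = sum(monthly_pct) + part_year_pct
    return weighted / total


if __name__ == "__main__":
    for year in sorted(TABLE_7):
        print(year, round(center_departure(TABLE_7[year], PART_YEAR[year]), 3))
```

The same function applies to any other weighting element, provided the twelve monthly percentages are supplied in the order July through June.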

Part-year returns may be filed for any of 132 possible part-year periods rang- 
ing in length from one to eleven months and having terminal dates ranging from 
January 31 of the given calendar year to May 31 of the following year. No 
detailed breakdown of the part-year returns—according to length of the ac- 
counting period or terminal date—has ever been published in Statistics of In- 
come. We here assume that if we take all the part-year returns together, the 
center of the composite group of part-year accounting periods is July 1 and has 
therefore a departure of 0. As is shown in Appendix C, the basis for this assump- 
tion is that the most probable distribution of the part-year periods—as to 
length and dating—would yield approximately this result. We must remark, 
however, that this “most probable distribution” is not very probable; wide 
variations from it can in actuality exist, and such variations might shift the 
center of the composite away from July 1. The resulting departure might be 
significant, but a departure as large as one month would be surprising. We can 
only guess, but it is highly improbable that the departure of the composite 
center from July 1 would be large enough to alter seriously the end result, the 
center of the average year for all returns. 

When computed in this way, the center of the average accounting period, 
based on number of returns, for all types of 1928 returns together, fell 0.186 
month—or less than six days—later than July 1. This is a negligible departure: 
the assumption that Statistics of Income tables belong to a year centered at 
July 1 appears to be satisfactory for 1928. 

A similar analysis for 1950 yields a weighted average departure of 0.369 
month, or about ten days, after July 1. This may not be a negligible departure, 
but it is certainly not large, and, for most purposes involving comparisons of 
corporate figures with various economic factors, a dating error up to ten days 
may well be less serious than various other errors which affect such comparisons. 
We must bear in mind that the present results, based upon number of returns 
as the weighting element, are tentative; the use of some more appropriate 
scheme of weighting might indicate a more serious average departure (see 
Table 8). 

13 The average accounting period of the returns tabulated for any year may differ according to the coverage of 
the returns: the average for the corporations of a particular line of industry, or of a particular size class, may differ 
from the average for the entire corporate system. Moreover, as appears from later examples in the text, the average 
differs according as it is based upon number of returns, amount of net income (or deficit), total assets, or some other 


TABLE 8 

CENTER OF AVERAGE ACCOUNTING PERIOD FOR THE ENTIRE CORPORATE 
SYSTEM, USING VARIOUS WEIGHTING BASES, SELECTED YEARS 

                                                                 Months by Which 
                                                   Tabulating    Central Date 
Weighting Element          Returns Covered         Year          Follows July 1 

Number of returns          All                     1928          .186 
                                                   1950          .369 

Net income                 With net income         1928          .140 
                                                   1950          .207 

Deficit                    With no net income      1928          .192 
                                                   1950          .187 

Net income less deficit    All                     1928          .124 
                                                   1950          .209 

Total assets               With balance sheets     1946          .093 
                                                   1947          .107 
                                                   1948          .005 
                                                   1949          .096 

Total compiled receipts                            1928          .187 
                                                   1929          .130 
                                                   1930          .14 

In judging the significance of the weighted average departure, the degree of 
scatter (dispersion) among the datings of the accounting periods must be 
known. The desired measure of scatter among the centers of the accounting 
periods is the standard deviation. This is calculated by the customary method, 
and yields, for 1950, 2.19 (months). Under favorable circumstances, this standard 
deviation may be interpreted as follows: Chances are about two out of three 
that the center of a particular accounting period will fall within a range of 2.19 
months on either side of the center of the average accounting period, which was 
found above to be about ten days after July 1 in 1950.¹⁴ But such favorable 
circumstances do not exist in this situation, for we know that a very large 
number of returns—60.07 per cent of the total—have centers falling precisely 
at July 1. A more nearly precise statement for the 1950 returns would be: The 
60.07 per cent which are calendar-year returns have their centers precisely at 
July 1; the 33.75 per cent which are fiscal-year returns have their average center 
1.094 months after July 1, and have a standard deviation of 3.69 months about 
that average; and the 6.18 per cent which are part-year returns have their 
average center very close to July 1, and have an unknown standard deviation 
about their average. 

14 Actually, the standard deviation, at 2.19, is seriously understated in this calculation, because we have treated 
all the part-year returns as though their centers were at July 1. While their average is probably approximately at 
July 1, as shown in Appendix C, the centers of specific part-year periods are scattered from early 1950 to late 1950, 
and the part-year returns would therefore contribute substantially to any true evaluation of the standard deviation. 
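
The relation between the three-group statement and the over-all figures can be checked with a short Python sketch. It assumes, as the calculation in the text does, that every part-year return is centered exactly at July 1 (zero mean and zero dispersion), and it combines the groups by the ordinary formula for the variance of a mixture; the function name is illustrative only. On these assumptions the combined mean is about 0.369 month and the combined standard deviation about 2.2 months, close to the 2.19 quoted above; the small gap presumably reflects rounding in the published group figures.

```python
import math


def mixture_mean_sd(groups):
    """Mean and standard deviation of a mixture of groups.

    Each group is (share, mean, sd), with shares summing to 1. The second
    moment of each group about zero is sd**2 + mean**2.
    """
    mean = sum(share * m for share, m, _ in groups)
    second_moment = sum(share * (sd ** 2 + m ** 2) for share, m, sd in groups)
    return mean, math.sqrt(second_moment - mean ** 2)


# 1950 figures quoted in the text; departures are in months after July 1.
GROUPS_1950 = [
    (0.6007, 0.0, 0.0),     # calendar-year returns, centered exactly at July 1
    (0.3375, 1.094, 3.69),  # fiscal-year returns
    (0.0618, 0.0, 0.0),     # part-year returns, ASSUMED to sit exactly at July 1
]

if __name__ == "__main__":
    mean, sd = mixture_mean_sd(GROUPS_1950)
    print(round(mean, 3), round(sd, 2))
```

Replacing the zero in the part-year row with any positive standard deviation raises the combined figure, which is the understatement discussed in the footnote.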

Using the methods by which we arrived at the first figure, 2.19, for the 1950 
standard deviation, we find a standard deviation of 1.38 months for 1928. As 
would be expected, in the earlier year the dispersion about the average center is 
much smaller than in 1950. If the recent expansion in the use of fiscal-year 
reporting continues after 1950, we may expect that the dispersion of the centers 
of the various accounting periods about their average center will continue to 
increase, and the assumption that the whole system of returns may be treated 
as having an average accounting period centering at a stated date—no matter 
how close to or far from July 1—will become increasingly less dependable. The 
increased dispersion within the system detracts from the usefulness of any 
average for the whole system. 

Thus far the weights used in locating the center of the average accounting 
period have been percentages in terms of number of returns. We can, of course, 
carry out similar calculations of the central date for various years by using as 
weights any of the measures of importance heretofore examined. Table 8 shows 
the results—including those already found for number of returns—for net in- 
come, deficit, difference between net income and deficit, total assets, and total 
compiled receipts. 

These results show considerable differences according to type of weighting 
system, and between the specified years for any one system. But none of the 
figures in the final column implies any wide departure of the average center 
from July 1—the greatest departure is about ten days for 1950 under the num- 
ber-of-returns weights. For the corporate system as a whole, we find then that 
the average accounting period has had a center falling only a few days after 
July 1. Except for the warning that dispersion detracts from the dependability 
of such an average, we may therefore say that the assumption that the Statistics 
of Income tabulations for the corporate system as a whole pertain to an average 
year centered at July 1 of the year of tabulation is approximately valid. Even 
with the recent great expansion in fiscal-year reporting, the validity of this 
finding has not been seriously impaired. We shall see in Part II that this finding 
does not necessarily remain valid for certain lines of industry within the entire 
corporate system, when studied separately. 


PART II. DIFFERENCES AMONG LINES OF INDUSTRY 


8. The industrial classification. Issues of Statistics of Income for 1946-1949 
include tabulations from fiscal-year returns classified by major industrial 
groups. These are the same groups, except for occasional changes in the break- 
down of classes, that are presented in the principal tables in Statistics of Income 
for all returns (and all balance-sheet returns), regardless of dating of the ac- 
counting periods, 1938 to 1950. The data for fiscal-year returns for 1946 to 1949 
appear in two tables, one for all returns whether or not accompanied by balance 
sheets, and the other for balance-sheet returns. Each table appears in two 
parts—one for returns with net income, the other for returns with no net in- 
come. The first table shows, for each industrial class, number of returns and 
amount of net income (or deficit); the second table shows number of returns 
and amount of total assets. 
The 1949 (and 1948) tables include 77 lines, each showing figures for all 
fiscal-year returns and a breakdown for the eleven fiscal-year months.¹⁵ The 
first of these lines is for the entire corporate system, regardless of type of indus- 
try. Nine of the lines relate to 8 broad industrial divisions—Agriculture, Min- 
ing, Construction, Manufacturing, Public utilities, Trade, Finance (including 
insurance, real estate, and lessors), and Services—and an unclassified division 
of returns not allocable among the specific divisions. The remaining lines of the 
table pertain to various groups within each division (except Construction) and 
to various subgroups within certain of the groups of the Trade and Finance 
divisions. As 12 of the lines represent subtotals of classes shown in other lines 
of the table, the industrial classification in 1949 actually includes 65 mutually 
exclusive line-of-industry classes; of these, 2 are not-allocable classes. Eliminat- 
ing the 2 not-allocable classes (numbers 56 and 77), and eliminating 4 classes 
combined into other classes in Table 9, 59 mutually exclusive classes are left. 
This provides a fairly detailed breakdown by line of industry. 

The term “mutually exclusive” must not, however, be read literally. The 
classification of corporate tax returns is not as clear-cut as it appears super- 
ficially, because any particular corporation is in fact classified by its principal 
business activity, defined in income-tax Form 1120 as the activity which ac- 
counts for the largest percentage of “total receipts.” As many corporations, 
especially the larger ones, engage in more than one line of activity, the entire 
industrial classification of the corporate data is seriously lacking in precision. 
Even for the broad divisions, considerable lack of precision exists; for example, 
a single corporation may in fact be engaged in mining, manufacturing, and 
trade, and yet its entire account will necessarily be tabulated in only one of 
these divisions. A possibility exists that, for the broad divisions, some of these 
classification errors¹⁶ are partially offset by other classification errors in the 
opposite direction, with the result that the accounting aggregates for a particu- 
lar division may not be seriously in error as reflecting conditions in that broad 
line of industry. For the more finely divided classes (the groups and subgroups), 
however, the possibility that the classification errors tend to offset each other 
is generally much less substantial. One must therefore be careful not to over- 
look the possible classification errors in any interpretation of the corporate 
tabulations as reflecting conditions in specified lines of industry. 

For one table appearing in Statistics of Income—not, however, a table for 
fiscal-year returns—a much more detailed industrial classification is presented.¹⁷ 
This table shows 279 lines, including the 77 classes noted above. The additional 








15 Because of minor differences in the scheme of classification, the tables for 1946 and 1947 show 86 lines. To 
provide comparability of 1946 and 1947 figures with those for 1948 and 1949 in my Table 9, I have combined 23 of 
the 86 lines of the earlier years into ten composite classes, and 7 of the 77 lines of the later years into three composite 
classes (see footnotes to Table 9). 

16 The word “error” is used here, not in the sense of a mistake in the assignment of a corporation to a class, but 
in the sense of the unavoidable inclusion in a particular class of the results of operations in lines of activity outside 
that class. Errors, in the sense of mistaken assignment to class, can of course exist. Form 1120 calls upon the reporting 
corporation to classify itself; and, although effort to classify properly is probably generally made, we cannot be sure 
that the classification invariably reflects the situation accurately. Moreover, as this is an item of information on the 
tax return which ordinarily has no part in determining the tax liability, we may suspect that somewhat less care 
is taken in reporting it than in reporting the facts essential to calculation of the tax. Finally, the class in which a 
particular corporation falls may change from year to year, as the identity of its principal business activity changes. 

17 S. of I., 1949, pp. 70-79. See also an analysis of a similar table for 1946 in W. L. Crum, Age Structure of the 
Corporate System, University of California Press, pp. 165-175. 


TABLE 9 
FISCAL-YEAR RETURNS AS A PERCENTAGE OF ALL BALANCE-SHEET 
RETURNS IN TERMS OF NUMBER AND TOTAL ASSETS, AND AS A 
PERCENTAGE OF ALL NET INCOME OR DEFICIT FOR RETURNS WITH 
NET INCOME OR NO NET INCOME AND BOTH CATEGORIES COM- 
BINED, BY INDUSTRIES, 1946 AND 1949 


Balance-Sheet Returns 


Total 
Assets 
1949 


1. All industrial groups 
Beverages 
Textile-mill productsᵃ 

Apparel and products made from 
fabrics 

Lumber and wood products, except 
furniture 

Furniture and fixtures 

Paper and allied products 

Printing, publishing, and allied in- 
dustries 

Chemicals and allied products 

Petroleum and coal products 

Rubber products 

Leather and products 

Stone, clay, and glass products 

Primary metals, and fabricated 
metal products except machineryᵃ,ᵇ 

Machinery, except transportation 
equipment and electrical 

Electrical machinery and equipment 

Transportation equipment, except 
motor vehicles 
Motor vehicles and equipment, except 
electrical 
Other manufacturingᵃ,ᵇ 
37. Public utilities 
Transportation 
Communication 
Other public utilitiesᵇ 


42. Trade 

Wholesale 
Commission merchants 
Other wholesalers 

Retail 
Food 
General merchandise 
Apparel and accessories 
Furniture and house furnishings 
Automotive dealers and filling stationsᵃ 


Drug stores 

Eating and drinking places 
Building materials and hardwareᵃ 
Other retail tradeᵃ 
Trade not allocable 


57. Finance, insurance, real estate, and 
lessors of real property 
Finance 
Banks and trust companies 
Credit agencies other than banksᵃ 
Holding and other investment companiesᵃ 
Security and commodity-exchange 
brokers and dealers 
Insurance carriers and agents 
Insurance carriers 
Insurance agents and brokers 
Real estate, except lessors of real 
property other than buildings 

Lessors of real property, except 
buildings 

Services 

Hotels and other lodging places 

Personal services 

Business services 

Automotive repair services and 
4.1 
40.5 


36.5 
36.0 25.5 


7.5 
34.4 
36.6 
32.7 
24.5 




37.1 37.9 




garages 
Miscellaneous repair services, hand 
trades 
74. Motion pictures 
75. Amusement, except motion pictures 
76. Other services, including schoolsᵃ 


34.8 28.3 
41.8 53.5 
39.7 48.7 
37.3 32.2 
77. Nature of business not allocable 180.5 





ᵃ To provide comparability with 1948 and 1949, each of these classes combines certain classes shown separately 
in S. of I., 1946 and S. of I., 1947. The titles (and serial numbers) shown above, except for classes also marked ᵇ, 
are those given in S. of I., 1949. The S. of I. classes included in each composite class are as follows: 
11. Nonmetallic mining and quarrying; Mining and quarrying not allocable 
17. Cotton manufactures; Textile-mill products, except cotton 
29. Iron, steel, and products; Nonferrous metals and their products 
36. Other manufacturing; Manufacturing not allocable 
51. Automotive dealers; Filling stations 
54. Hardware; Building materials, fuel, and ice 
55. Package liquor stores; Other retail trade; Retail trade not allocable 
60. Long-term credit agencies, mortgage companies, except banks; Short-term credit agencies, except banks; 
Finance not allocable. 
61. Investment trusts and investment companies; Other investment companies, including holding companies; 
Other finance companies 
76. Other service, including schools; Service not allocable 


ᵇ To provide comparability with 1946 and 1947, each of these classes combines certain classes shown separately 
in S. of I., 1948 and S. of I., 1949. The titles shown above for these classes are adapted by rough combinations of 
the titles shown in S. of I., 1949 for the classes thus combined. The S. of I. classes included in each composite class 
are as follows: 
29. Primary metal industries; Fabricated metal products, except ordnance, machinery, and transportation 
equipment; Ordnance and accessories 
36. Scientific instruments, photographic equipment, watches, clocks; Other manufacturing 
41. Electric and gas utilities; Other public utilities 
202 lines give subclasses for most of the 77 classes called “major groups” in the 
general Statistics of Income tabulations. These additional data are somewhat 
helpful in determining the make-up of some of the major groups examined 
below. 

9. Industrial differences in fiscal-year reporting: number of balance-sheet re- 
turns. In examining the share of fiscal-year returns among all returns for each 
specified line of industry, we shall first give attention to the balance-sheet re- 
turns. A glance at the first two columns of Table 9 reveals that without a single 
exception the 1949 percentage was above that for 1946. In other words, the 
1946-1949 general expansion of fiscal-year reporting noted in Part I appears to 
have occurred in all lines of industry, though it was much greater for some 
classes than for others. 

Here, however, we are especially interested in differences among lines of 
industry in any one year, and present remarks are confined to the figures for 
1949. First, we may note important differences among the eight broad divisions, 
as follows: 

 2. Agriculture                  37. Public utilities 
 6. Mining                       42. Trade              41.4 
12. Construction                 57. Finance 
13. Manufacturing     40.2       68. Services           37.9 


For three divisions, Mining, Public utilities, and Finance, the percentages are 
below the over-all figure of 36.0 for the entire corporate system, and for the 
other five divisions, above. The highest figure is for Agriculture, which may 
reflect the fact that, except perhaps in Forestry, enterprises in this line are 
generally subject to a seasonal pattern which encourages fiscal-year reporting. 
We shall find other lines of industry—though not entire divisions—in which the 
accounting period tends to have its terminal month in the slack season. The 
next highest ratios are for Trade and Manufacturing, and we can best attempt 
to explain these high percentages later in examining the evidence for groups or 
subgroups within these divisions. The figures for Construction and Services are 
barely above the over-all figure of 36.0, and at this stage we merely remark that 
these divisions show no peculiar tendency as to the degree of fiscal-year re- 
porting. 

Among the divisions with percentages below the over-all figure, I can suggest 
no explanation for Mining; but note that the percentage for every group within 
that division is also below the over-all figure. The lowest percentages are for 
the Finance and Public utilities divisions, and the probable chief explanation 
in each case is that many of the enterprises covered are subject to supervision 
and regulation by public authority. Insofar as the regulatory authorities require 
reporting on a calendar-year basis, one may suppose that the corporations 
would tend to file for taxes on the same basis. An important group within the 
Finance division is Insurance, and those life insurance companies which file on 
Form 1120L and those mutual insurance companies which file on Form 1120M 
apparently have no choice but to file for the calendar year.¹⁸ This presumably 
explains why the percentage for the Insurance carriers subgroup (line 64 of 
Table 9) is the lowest shown for any class. 

18 Instruction B on Form 1120L (and Form 1120M) for 1949 reads: “The return shall be for the calendar year 
ending December 31, 1949, and the net income computed on the calendar year basis in accordance with the state 
laws regulating insurance companies.” S. of I., 1949, pp. 459 and 465. I have discovered no other instance in which 
the reporting requirements of the income tax specifically limit the freedom to file fiscal-year returns. 

The 1949 ratio for the Manufacturing division, 40.2 per cent, is several points 
above that for all industrial groups combined, 36.0 per cent. Of the twenty 
groups within the Manufacturing division, the following thirteen have ratios 
above the over-all figure of 36.0: 


18. Apparel and products made from fabrics 
26. Leather and products 
17. Textile-mill products 
20. Furniture and fixtures 
36. Other manufacturing 
31. Electrical machinery and equipment 
15. Food and kindred products 
19. Lumber and wood products, except furniture 
32. Transportation equipment, except motor vehicles 
33. Motor vehicles and equipment, except electrical 
23. Chemicals and allied products 
25. Rubber products 
30. Machinery, except transportation equipment and electrical 

Thus, the majority of the groups in this division in 1949 filed a greater fraction 
of the total number of returns on a fiscal-year basis than the corresponding 
fraction (36.0 per cent) for the entire corporate system. The remaining seven 
groups had percentages below the over-all average, with the lowest, for Tobacco 
manufactures, at 23.8. 

In two groups, Apparel and Leather, more than half the 1949 returns were 
filed on a fiscal-year basis. Conceivably, the exceptionally high percentages for 
these two groups—and perhaps also for certain other groups showing figures 
not far below 50 per cent—reflect strong seasonal influences which tend to 
encourage use of an accounting period other than the calendar year. Such 
seasonal influences may be incident to the nature of the raw material, the par- 
ticular manufacturing processes used, the selling customs of the business, or the 
seasonal pattern of the distributors or other purchasers of the products manu- 
factured. Contrariwise, one may suspect that the groups showing exceptionally 
low percentages—Tobacco, Stone, Printing, and Paper—are much less subject 
to seasonal influences, either in their own operations or in their relations with 
their customers. A careful study of the specific seasonal variations involved, 
however, is likely to be obstructed by the fact, already noted, that a particular 
group in the industrial classification of corporations is not precisely limited to 
a narrow line of industry. All we can do at this stage is to suggest that seasonal 
factors may to an important degree account for the wide variation among the 
percentages for the twenty groups in the Manufacturing division. 

The 1949 ratio for the Trade division, 41.4 per cent, is also above the over-all 
figure of 36.0 per cent. This is true for each of the three groups within the 
division: Wholesale, Retail, and Trade not allocable (this last does not pertain 
to any identifiable branch of trade). The fact that the percentage for the Whole- 
sale group, 43.7, stands above that for the Retail group, 39.9, may appear 
somewhat surprising. I can offer no confident explanation for this, but venture 
a hesitant suggestion that those distribution lines for which seasonal factors 
are of minor importance bulk larger, in terms of number of corporations, in retailing 
than in wholesaling. Some part of the result may arise also from differences 
in specialization between the wholesale trade and the corresponding retail 
trade. For example, a typical retail food store might be expected to have 
only moderate seasonal variation in sales, and certain commodities handled by 
the store would presumably show little such variation. In the relevant wholesale 
trades, however, the situation might be very different because of specialization. 
A wholesale dealer in produce may have a very sharp seasonal factor—a much 
more intense seasonal factor than that affecting the food retailers who purchase 
from him. Finally, some lines of wholesale trade may have no significant coun- 
terpart in the retail trade, for example, jobbers and manufacturers’ agents 
handling machinery and industrial raw materials, and any other wholesalers 
dealing in commodities sold mainly only in wholesale lots. Insofar as such lines 
are affected by seasonal factors, they might contribute to the higher percentage 
found above for the wholesale group. 

Among the nine subgroups within the Retail group, all but three have per- 
centages above the over-all figure for the entire corporate system, as follows: 

49. Apparel and accessories 
48. General merchandise 
55. Other retail trade 
50. Furniture and house furnishings 
53. Eating and drinking places 
47. Food 

The first of these subgroups, Apparel and accessories, has a higher percentage 
than the highest found for any group in the Manufacturing division—54.1 for 
the Apparel group—and higher in fact than that for any other class shown in 
Table 9. One can easily assume that the intense seasonal variation in the type 
of distribution activity occurring in this subgroup is a major factor in explaining 
the very high figure. The same can probably be said of General merchandise, for 
Department stores, one of the largest subsubgroups, in terms of number of 
corporations, within that subgroup, may be subject to seasonal fluctuations of 
the same general nature as those affecting the apparel stores.¹⁹ One may, however, 
question whether seasonal factors are an important cause of fiscal-year 
reporting for such subgroups as Eating and drinking places and Food, which 
have percentages only slightly above the over-all average. 

The figure for the Services division, 37.9, is not much above the over-all 
average. Within the division, however, the following three groups have percentages 
well above that average: 

74. Motion pictures 41.8 
69. Hotels and other lodging places 41.2 
75. Amusement, except motion pictures 39.7 


In each of these lines, considerable seasonal factors may affect many of the 
corporations included. Significant seasonal factors may be present in two other 
groups: Automotive repair services (which includes filling stations), and Other 
services (which includes schools); the percentages for these two groups stand 
slightly above the over-all average, and considerably above the three groups 
(numbers 70, 71, and 73) not heretofore mentioned. 

19 See data on number of returns for the subsubgroups in S. of I., 1949, pp. 76-77. 

10. Industrial differences in fiscal-year reporting: total assets. In the Statistics 
of Income table showing fiscal-year data for balance-sheet returns, the only 
figure shown for each line of industry, besides number of returns, is the account- 
ing item total assets. As total assets is a much better indicator of importance 
than mere number of returns, and is in fact the best general-purpose measure of 
importance, we examine now the portion of the aggregate total assets reported 
on all balance-sheet returns in each line of industry that was reported on fiscal- 
year returns. 

Although percentages for 1946 have been computed and examined, only those 
for 1949 are presented here. The 1946-1949 change in the total-assets per- 
centage for most lines of industry showed advances, only ten lines showing de- 
clines, whereas the number-of-returns percentages advanced for all lines. But 
even when the percentages on both bases advanced, the advances usually dif- 
fered in degree. The manufacturing group Electrical machinery and equipment 
showed the following changes between 1946 and 1949: 


1946 1949 


In terms of number 34.0 42.0 
In terms of total assets 11.6 10.2 


In this case, average total assets reported per fiscal-year return declined, while 
it rose for other returns. In some other case, of course, the diversity of change 
in average total assets might not be so marked, but it would still lead to differences 
between the percentages.²⁰ 
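The arithmetic behind the Electrical machinery figures can be checked with a short sketch. It assumes only that the total-assets percentage equals fiscal-year assets divided by all assets, and it uses the average total assets per return given in the footnote to this paragraph; the function name is illustrative, not the author's.

```python
def total_assets_share(number_share, avg_fiscal, avg_other):
    """Fiscal-year share of total assets (per cent), from the fiscal-year share
    of returns (per cent) and average total assets per return in each class."""
    p = number_share / 100.0
    fiscal_assets = p * avg_fiscal          # fiscal-year assets per return filed
    other_assets = (1.0 - p) * avg_other    # other assets per return filed
    return 100.0 * fiscal_assets / (fiscal_assets + other_assets)

# Electrical machinery and equipment; averages in thousands of dollars
share_1946 = total_assets_share(34.0, 656, 2564)
share_1949 = total_assets_share(42.0, 487, 3096)
print(round(share_1946, 1), round(share_1949, 1))
```

The two results reproduce the printed 11.6 and 10.2: the number-of-returns share rose, but the falling average for fiscal-year returns and the rising average for other returns more than offset it.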
Our primary interest attaches, however, to differences in the percentage 
among classes; here also remarks are confined to the figures for 1949 (see 
Table 9, column c). For the eight broad divisions, the 1949 percentages are: 

 2. Agriculture      37.9        37. Public utilities    1. 
 6. Mining           15.4        42. Trade 
12. Construction     33.9        57. Finance 
18. Manufacturing    23.3        68. Services           40. 





20 For the Electrical group, both number of returns and total assets increased from 1946 to 1949, for both fiscal-year 
and other returns. But the average total assets per return changed as follows (in thousands of dollars): 


1946 1949 


Fiscal year returns 656 487 
Other returns 2,564 3,096 


The net increases in the number of returns arise from: newly chartered corporations which became active during the 
three year period, minus corporations which had been active in 1946 but were dissolved or became inactive by 1949, 
plus corporations which were shifted into this group by changed classification as to line of industry, minus corpora- 
tions similarly shifted out of the group, plus any net change arising from changes in the filing of consolidated returns. 
The net increases in total assets represent not only (1) changes in total assets reflecting the above changes in the 
corporations included in the group's list, but also (2) the net change between 1946 and 1949 of the aggregate total 
assets of those corporations which were in the list in both years (such change in total assets, for an identical corpora- 
tion, can be substantial, of course, over a three-year period and can be either an increase or a decrease). Even if no 
changes occurred in the list—if the lists of corporations were identical in 1946 and 1949—changes in the total assets 
of the various corporations in the list might have been such that the average total assets per return for fiscal-year 
corporations declined whereas that for other returns rose. In this case, if the number-of-returns percentage rose from 
1946 to 1949, the total-assets percentage would have risen less or might even have declined. 
Here the variation around the over-all average of 13.2 for the entire corporate 
system is much wider than the variation of number-of-returns percentages. For 
two divisions, Public utilities and Finance, the figures run far below the over-all 
average. The Mining percentage is only moderately above the over-all 
figure. An almost negligible percentage of the total assets in Public utilities is 
reported on fiscal-year returns, whereas in Trade nearly half is reported on 
fiscal-year returns. This wide variation can be found also among the group 
percentages in several divisions: the range is from 3.2 to 20.8 among the groups 
in Mining, from 8.4 to 64.8 among the groups in Manufacturing, from 0.0 to 
37.4 among the groups in Finance, from 25.5 to 53.7 among the groups in 
Services; and the range is from 24.3 to 71.3 among the subgroups of the Retail 
group in Trade. 

The basic reason why these comparisons differ so strikingly from those in 
terms of number of returns is the contrast between fiscal-year returns and other 
returns in average total assets per return. We may illustrate by citing 1949 
figures for the groups Tobacco and Leather in the Manufacturing division and 
the subgroup Commission merchants in the Wholesale group under Trade. For 
Tobacco, the number-of-returns percentage is far above the total-assets per- 
centage, and average total assets is much smaller for fiscal-year than for other 
returns. For Leather, the number-of-returns percentage is sharply below the 
total-assets percentage, and average total assets is much higher for fiscal-year 
than for other returns. For Commission merchants, the two percentages are 
almost exactly equal, and average total assets is almost the same for fiscal-year 
and other returns. Such slight differences as do exist between the percentages 
and between the averages are in the same directions as found for Leather.²¹ 
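The rule stated in this paragraph is that the total-assets percentage lies above the number-of-returns percentage exactly when fiscal-year returns have the higher average total assets. A minimal sketch tests it against the three sets of supporting figures in the footnote. The data layout and the helper name are assumptions made for illustration.

```python
# (number-of-returns %, total-assets %, avg assets fiscal-year, avg assets other), 1949
groups = {
    "Tobacco manufactures": (23.8, 8.4, 4727.2, 16003.3),
    "Leather and products": (50.7, 64.8, 607.8, 339.4),
    "Commission merchants": (42.1, 42.2, 167.7, 167.2),
}

def gap_directions(number_pct, assets_pct, avg_fiscal, avg_other):
    """Return the sign of (assets % - number %) and of (avg fiscal - avg other)."""
    sign = lambda x: (x > 0) - (x < 0)
    return sign(assets_pct - number_pct), sign(avg_fiscal - avg_other)

results = {name: gap_directions(*row) for name, row in groups.items()}
for name, (pct_sign, avg_sign) in results.items():
    print(f"{name}: percentage gap {pct_sign:+d}, average gap {avg_sign:+d}")
```

In each of the three cases the two signs agree, including the very small positive gaps for Commission merchants.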

This matter of differences in average total assets per return between fiscal- 
year and other returns is tied up with the question of variations in fiscal-year 
reporting according to size of corporation. Detailed discussion of the matter is 
postponed until Part III, where the size variations are examined. There also 
some attempt is made to explain these facts in more meaningful terms than the 
mere citation of the arithmetical facts. For the purpose of the present discus- 
sion, we need merely note that these arithmetical facts do account for differ- 
ences between number-of-returns and total-assets percentages shown for 1949 
in Table 9. If this consideration is borne in mind, one may interpret differences 
among the various total-assets percentages of Table 9 along the same lines (with 
heavy emphasis on seasonal factors) used in interpreting corresponding 
differences among the number-of-returns percentages. Therefore, such inter- 
pretations are not now repeated. 

11. Industrial differences in fiscal-year reporting: number of income-tax returns. 
If industrial differences in the tendency to report on a fiscal-year basis 





21 The supporting figures are as follows: 

                               Fiscal-Year Percentage      Average Total 
                               in Terms of:                Assets (thous. $): 
                               Number of     Total         Fiscal-Year    Other 
                               Returns       Assets        Returns        Returns 
16. Tobacco manufactures       23.8           8.4          4,727.2        16,003.3 
26. Leather and products       50.7          64.8            607.8           339.4 
44. Commission merchants       42.1          42.2            167.7           167.2 
are considered in terms of the number of all income-tax returns (both balance-sheet 
and non-balance-sheet), in nearly all lines of industry the percentages are 
somewhat smaller than those for balance-sheet returns shown in column b of 
Table 9. Moreover, although the special balance-sheet tabulations of fiscal-year 
returns classified by line of industry are available only for the years 1946-1949, 
the corresponding tabulations for all returns not only exist for those years but 
can be compared—so far as number of returns is concerned—with a roughly 
corresponding industrial breakdown for 1939 and a considerably less detailed 
breakdown for 1934. 

The industrial classification used in Statistics of Income for 1934-1937 was 
much less detailed than that used for 1938 and later years. Furthermore, even 
though certain 1934 classes bear approximately identical names as those ap- 
plied to corresponding classes in 1938 and later years, exact comparability does 
not exist. The extensive revision in 1938, which attempted to put the Statistics 
of Income classification as nearly as possible in line with the Standard Industrial 
Classification of the Division of Statistical Standards of the Bureau of the 
Budget, involved certain shifts of minor industrial groups from one major 
group to another.²² In the main the “major groups” (the industrial divisions, 
groups, and subgroups such as are listed in Table 9) were not seriously distorted 
by these shifts; but some distortion did occur in certain cases, and comparisons 
between 1934 and later years must accordingly be made with some 
reservations.²³ 

The fiscal-year percentages based on all income-tax returns have been calculated 
and examined for the years 1934, 1939, 1946, and 1949, for each of the 
classes of Table 9 for which data are available, but the figures are not presented 
here. The classes as specified for 1939 are fairly closely comparable with those 
in 1946 and 1949, but, as indicated above, certain classes shown for 1934 are 
only roughly comparable with those of the three other years. Without exception, 
the 1949 figure is higher than the 1946 figure, for every industrial class; 

22 See S. of I., 1938, pp. 6-7 and 241-273. 

23 The effect of these shifts on the major groups can be inferred from S. of I., 1938, pp. 223-228, in comparison 
with relevant figures on pp. 90-103 and pp. 104-116. For example, the Mining and quarrying division classified for 
1938 on the old basis (p. 223) and on the new basis (pp. 90 and 104) shows the following: 

                                 Old Basis    New Basis 
Returns with net income: 
  Number of returns                  4,470        3,391 
  Net income (thous. $)            210,354      199,621 
Returns with no net income: 
  Number of returns                  8,699        7,561 
  Deficit (thous. $)               161,041      152,440 

Corresponding figures for the Chemicals group in Manufacturing are: 

                                 Old Basis    New Basis 
Returns with net income: 
  Number of returns                  2,732        2,799 
  Net income (thous. $)            336,390      339,112 
Returns with no net income: 
  Number of returns                  3,890        4,002 
  Deficit (thous. $)                27,606       30,727 

24 After the major revision of 1938 in the industrial classification, a small number of minor revisions were made 
in certain later years, especially in 1948. Particular attention is called to the changes of 1940; see S. of I., 1940, 
p. 310, and S. of I., 1941, pp. 300-308. Changes in other years since 1938 are described in S. of I. as follows: 1939, 
none; 1941, pp. 300-303; 1942, pp. 7-8; 1943, p. 5; 1944-1947, none; 1948, pp. 425-450; 1949, p. 4; and 1950, none. 

and for some classes the 1946-1949 rise is very sharp. For every class for which 
the 1939 figure is available, except Forestry (4), the 1946 figure is above that 
for 1939. For every class for which the 1934 figure is available, the 1939 figure 
is above that for 1934. These findings emphatically support our general con- 
clusion that from 1934 to 1949 the extension in the practice of fiscal-year report- 
ing was very rapid and was widespread among all lines of industry. Differences 
among lines of industry, in any of the four years, can largely be explained in 
terms of seasonal factors and the role of regulatory authorities, as suggested 
above in Section 9, and such discussion is not repeated here. 

12. Industrial differences in fiscal-year reporting: net income or deficit. In 
Section 3 attention was called to serious defects in net income (or deficit) as a 
measure of importance in comparing fiscal-year returns with all returns. These 
defects are likely to distort not only the indicated change in the fiscal-year 
percentage from one year to another, but also comparisons of such percentages 
among industrial classes in any one year. Nevertheless, in columns d-f of 
Table 9 the 1949 percentage of net income (or deficit) shown on fiscal-year 
returns is presented for each industrial class and for each of three categories: 
returns showing net income, returns showing no net income, and the combina- 
tion of these two categories. The results are presented partly to provide the 
basis for pointing out and explaining some of the peculiarities in the way 
cyclical variations may influence not only different lines of industry but also 
fiscal-year returns as compared with other returns and partly because emphasis 
on net income, for the corporate system as a whole and for various lines of 
industry, may have particular significance for those specialists concerned with 
the tax implications of fiscal-year reporting. 

A first point to be noted is that, for any class, the ratio for the net category 
falls between those for the deficit and combined categories.²⁵ We must bear in 
mind that 1949 was a year of moderate prosperity, despite the mild recession 
from 1948. The net income of net-income corporations exceeded, and for most 
lines of industry greatly exceeded, the deficit of no-net-income corporations. 
In almost every instance, the deficit category is a minor element in the class, 
and the net category makes up the bulk of the class. Hence, the deficit per- 
centage cannot be relied upon as fairly typical of the class as a whole, whereas, 
in most instances, the net percentage indicates the fiscal-year share for most 
corporations of the class. 
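The first point of this paragraph, that the net percentage falls between the deficit and combined percentages whenever combined net income is positive, follows from a short piece of algebra. The symbols below are introduced here for illustration only and are not the author's notation: $N$ and $D$ are the net income and deficit on all returns of a class, and $N_f$ and $D_f$ the corresponding amounts on fiscal-year returns.

```latex
\[
n = \frac{N_f}{N}, \qquad d = \frac{D_f}{D}, \qquad
c = \frac{N_f - D_f}{N - D} = \frac{nN - dD}{N - D}, \qquad N > D > 0 .
\]
\[
c - n = \frac{nN - dD - n(N - D)}{N - D} = \frac{D\,(n - d)}{N - D} .
\]
```

Since $D/(N-D) > 0$, the difference $c - n$ has the same sign as $n - d$. Hence either $d < n < c$ or $c < n < d$, and in both cases $n$ lies between the other two ratios.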

One might at once suggest that the combined percentage truly represents all 
corporations of the class and should therefore be taken as the measure of com- 
parative importance of fiscal-year reporting—among the industrial classes in 
1949—in terms of net income. But we should not forget that the combined net 
income—whether for fiscal-year returns or for all returns of any class—is a 
residual figure, obtained by subtracting the deficit of no-net-income corporations 
from the net income of net-income corporations. Such residual figures, 

25 This is algebraically necessary in a year such as 1949 in which, for any one of the industrial classes, the deficit 
of all no-net-income corporations is smaller numerically than the net income of all net-income corporations; in 
other words, the combined category has a positive net income. (The situation would be altered, for many industrial 
classes, in a year of deep depression such as 1932.) 

when used to derive ratios, can yield somewhat weird results. An outstanding 
example of this appears in the 1949 result for the Nature-of-business-not- 
allocable division. Here the combined percentage is over 180, and this implies 
that the fiscal-year returns were, in terms of net income, 180 per cent of all 
returns of this class. This in turn implies that the non-fiscal-year returns were 
minus 80 per cent of all returns. Yet we know from Table 9 that about 70 per 
cent of the returns, by number, were non-fiscal-year returns, and that over 73 
per cent of total assets was reported on non-fiscal-year returns.²⁶ Although the 
180 per cent is numerically valid, and the inference that the fiscal-year returns 
show a combined net income 180 per cent of that for all returns is literally cor- 
rect, it seems somewhat ridiculous as an estimate of the importance of fiscal- 
year reporting in this class for 1949, unless we interpret importance in a very 
narrow sense. No other figure in the Combined column of Table 9 is so extreme 
as this case, but for various other classes shown in the table, the combined 
figure may be influenced to a lesser extent by the same type of circumstance. 
Any combined percentage which is “unduly” high or low may be suspect on 
these grounds, and even those which are not may nevertheless be misleading 
because of their status as ratios of residuals. 
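How a ratio of residuals can exceed 100 per cent is easy to reproduce from the supporting figures in the footnote for the Not-allocable division. The sketch below takes the fiscal-year deficit as 2,310, the value consistent with the printed residual of 3,311 and the printed deficit percentage of 30.5; that reading, and the variable names, are assumptions.

```python
def pct(part, whole):
    """Part as a percentage of whole, rounded to one decimal."""
    return round(100.0 * part / whole, 1)

# Not-allocable division, 1949, thousands of dollars.
# Assumption: fiscal-year deficit read as 2,310 (consistent with 3,311 and 30.5).
net_all, net_fiscal = 9420, 5621
deficit_all, deficit_fiscal = 7586, 2310

combined_all = net_all - deficit_all              # residual, all returns
combined_fiscal = net_fiscal - deficit_fiscal     # residual, fiscal-year returns
combined_other = combined_all - combined_fiscal   # residual, non-fiscal-year returns

print(pct(net_fiscal, net_all))             # net percentage
print(pct(deficit_fiscal, deficit_all))     # deficit percentage
print(pct(combined_fiscal, combined_all))   # combined percentage
print(pct(combined_other, combined_all))    # implied share of other returns
```

The combined percentage comes out above 180 and the implied share of non-fiscal-year returns is negative, because the non-fiscal-year returns taken together show a combined deficit.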

This does not mean that these combined percentages have no significance, 
particularly in studying specific aspects of fiscal-year reporting; but it does 
mean that they are not a simple guide to appraising the importance of fiscal- 
year reporting, as between two years, or as among lines of industry. It also 
means that these combined ratios should not be expected to tally with corre- 
sponding ratios in terms of number of returns, or in terms of that more de- 
pendable measure of importance, total assets. 

Particularly in studying certain tax implications of fiscal-year reporting, the 
net percentages of net-income corporations may have significance. Correspond- 
ing figures for 1946 have been calculated but are not presented in tabular form. 
Nearly half of the classes show declines in the net percentage from 1946 to 1949, 
whereas the corresponding figures in terms of number of returns showed no 
declines.²⁷ 

That the changes in net percentage from 1946 to 1949 can be different, for 
various classes, from those based upon number of returns can at least be ex- 
plained in numerical terms. Consider, for example, the Manufacturing division: 
the net percentage declines from 28.8 in 1946 to 21.9 in 1949. The appropriate 
number-of-returns percentages, which pertain only to the net category and are 





26 The findings shown in Table 9 for this class are explained as follows (dollars in thousands): 

                              All        Fiscal-Year    Per Cent 
                              Returns    Returns        Fiscal Year 
Net income, net returns       9,420      5,621           59.7 
Deficit, no-net returns       7,586      2,310           30.5 
Net income, combined          1,834      3,311          180.5 

One may call it a mere “accident,” but the figures for deficit and for net income happen to be so related that the 
residual figure (net income, combined) is much larger for fiscal-year than for all returns. 

27 Figures in columns a and b of Table 9 refer to both net and no-net corporations combined, whereas column d 
pertains only to the net-income category. 
not those of Table 9, show a 1946-1949 rise from 31.7 to 39.1. How do we account 
for this? The basic figures show that the number of fiscal-year returns 
rose, whereas the number of all returns declined; hence the number-of-returns 
percentage rose. The net income of fiscal-year returns declined, whereas the net 
income of all returns rose; hence the net-income percentage declined.²⁸ 

In the Manufacturing division, the average net income per fiscal-year re- 
turn declined, whereas that for all returns rose; and, of course, the average for 
the non-fiscal-year returns rose even more sharply. Why did the total net income 
on fiscal-year returns, as well as the average per return, decline, whereas 
corresponding figures for all returns (and still more so for other returns) rose? 
One important contributing cause was probably the fact that the mild industrial 
recession of 1949 had a specially heavy impact on various lines of manufacturing 
industry which tend strongly to file fiscal-year returns, and less impact 
or none at all on those lines in which fiscal-year reporting is not very common. 
Or, the peculiar adversities of 1946 may have hit with special force those 
manufacturing lines in which fiscal-year reporting was not very common. Even 
within any particular line of manufacturing, the impact of cyclical changes in 
business may vary greatly among corporations, and by chance those feeling the 
heavier impact might in the main be filing fiscal-year returns. The truth is that 
net income (or deficit) fluctuates widely and irregularly, and mere chance might 
result in a very different showing for fiscal-year returns than for other returns. 
In any case, I see no reason to suspect that fiscal-year reporting would, in any 
year chosen at random, be more (or less) likely to be practiced by corporations 
having high net incomes than by those having low net incomes or deficits. 

The Deficit percentage for the Manufacturing division rose from 31.6 in 1946 
to 39.5 in 1949, whereas the number-of-returns percentage rose only from 33.1 
to 38.6. The key averages (in thousands of dollars) are: 


                                1946     1949 
Average deficit per return: 
  Fiscal-year                   36.9     22.8 
  All                           38.6     22.4 


Very large increases occurred in the number of no-net-income returns and in 
the amount of deficit—for both fiscal-year and all returns. On the other hand, 
the average deficit per return declined, but less sharply for fiscal-year than for 
all returns; this explains why the Deficit percentage for the division rose more 
sharply than the number-of-returns percentage. A notable fact is that, for the 
non-fiscal-year returns, the aggregate deficit declined, despite a large increase 
in the number of returns. 





28 Actually a difference in direction of 1946-1949 movement between the number-of-returns and the net-income 
percentages could occur without the basic figures showing such drastically diverse movements as in the case of the 
Manufacturing division. In fact, the ratio of the 1949 net percentage to that for 1946 is derivable from the ratio of 
the 1949 number-of-returns percentage to that for 1946 by reference to 1949-1946 ratios of average net income per 
return, for fiscal-year and for all returns, by use of an algebraic formula. The key to the diverse 1946-1949 movements 
of the two percentages is thus the relation among the four average-net-income figures. These, in the Manufacturing 
case, are (in thousands of dollars): 

1946 
Average net income per return: 
Fiscal-year 
All 
The combined percentage for the Manufacturing division declined from 28.5 
to 20.5, whereas the number-of-returns percentage (Table 9) rose from 32.1 to 
38.9. The key averages (in thousands of dollars) are: 

1946 1949 
Average net income per return: 
Fiscal-year 106.1 64.0 
All 119.2 121.5 
Again, the difference in direction of movement between the net-income per- 
centage and the number-of-returns percentage can be explained by the relative 
changes in the average net income per fiscal-year return and for all returns.²⁹ 
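The relation used in these comparisons can be written as a single identity: the fiscal-year percentage of an amount equals the number-of-returns percentage multiplied by the ratio of the average amount per fiscal-year return to the average per return for all returns. The sketch below applies it to the Manufacturing figures quoted in the text. The function name is an assumption, and the results agree with the printed percentages only to within the rounding of the published averages.

```python
def amount_pct(number_pct, avg_fiscal, avg_all):
    """Fiscal-year percentage of an amount, from the number-of-returns
    percentage and the average amount per return (fiscal-year and all)."""
    return number_pct * avg_fiscal / avg_all

# Manufacturing division, combined net income (averages in thousands of dollars)
combined_1946 = amount_pct(32.1, 106.1, 119.2)
combined_1949 = amount_pct(38.9, 64.0, 121.5)

# Manufacturing division, deficit category
deficit_1946 = amount_pct(33.1, 36.9, 38.6)
deficit_1949 = amount_pct(38.6, 22.8, 22.4)

for label, value in [("combined 1946", combined_1946), ("combined 1949", combined_1949),
                     ("deficit 1946", deficit_1946), ("deficit 1949", deficit_1949)]:
    print(f"{label}: {value:.1f}")
```

The combined percentage falls because the fiscal-year average dropped sharply relative to the average for all returns; the deficit percentage rises faster than the number-of-returns percentage because the fiscal-year average deficit fell less than the average for all returns.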

Although these percentages have little value in appraising the comparative 
importance of fiscal-year reporting, either as between two years or as among 
lines of industry, they may have high significance for certain specific purposes. 
Thus, the net percentages may be very informing for the study of tax implica- 
tions of fiscal-year reporting (further attention is given to this point in Sec- 
tion 15). Likewise, the combined percentages may be highly useful in studies of 
cyclical fluctuations in corporate profits, whenever such studies must give care- 
ful attention to the differences in dating of corporate accounting years. 

13. Monthly distribution of accounting years by line of industry: number of 
returns. In Sections 5 and 6 the monthly distribution of accounting years for the 
entire corporate system, regardless of line of industry, was examined in terms 
of number of returns and of specified accounting items. In Section 7 these 
results were used to estimate the deviation of the center of the average year 
from July 1 for the entire list of returns whatever their specific accounting 
years. That analysis showed that, for the corporate system as a whole, even 
after the great extension in the use of fiscal-year reporting which had developed 
by 1949, the center of the average year did not deviate more than a few days 
from July 1. But the dispersion of the various accounting years about their 
average was found to be large enough so that the average year might not be 
sufficiently typical for some purposes of interpretation. And I remarked that 
the deviation of the center of the average year from July 1 might prove much 
more considerable for some lines of industry than for the corporate system as a 
whole. We now examine evidence on this last point, bearing in mind that such 
evidence may be informative for certain other purposes also. 

The formidable task of carrying out a monthly analysis for every division, 
group, and subgroup of industry listed in Table 9 did not appear feasible. 
Moreover, for numerous groups and subgroups, fiscal-year reporting is of such 
small importance that no worthwhile results can be expected from a monthly 
analysis. Accordingly, the analysis actually carried out was confined to a 
limited list of industrial classes; of these a still smaller list is reported here, 





29 The various averages of net income or deficit shown for the net, deficit, and (to a smaller extent) combined 
categories in the text are not highly typical: any particular average does not necessarily reflect the approximate situation 
for a considerable fraction of all corporations covered by the average. The reason is that the size distribution 
of net income (or deficit) tends to be extremely J-shaped, rather than bell-shaped: a very large number of corporations 
have net income (or deficit) only slightly different from zero, whereas a very few corporations may show extremely 
high figures. In these circumstances, the average is likely to fall at a point in the size scale near which only a 
moderate number of specific corporations fall. Nevertheless, for the purpose served by the averages in the analysis 
in the text, the averages are entirely valid. 
determined as follows. Each of the eight broad divisions was included in the 
list: Agriculture, Mining and quarrying, Construction, Manufacturing, Public 
utilities, Trade, Finance, and Services, and also the Not-allocable division. 
Next, the 1949 figure for total assets, combining the net and no-net categories 
as tabulated in Statistics of Income, was found for each of the eleven fiscal years 
(July to November 1949, and January to June 1950) for each of the sixty-seven 
groups and subgroups covered by the 1949 tabulations. For each such group or 
subgroup the total assets for all balance-sheet returns, regardless of accounting 
period, was also determined, and 20 per cent of this total was taken as the 
critical figure. A particular group or subgroup showing in at least one of the 
fiscal-year months a total-assets figure above this critical figure qualified for 
the selected list. The five groups and subgroups thus selected were (with nu- 
merals as in Statistics of Income, 1949, pages 16-17): 


In the Manufacturing division: 
26. Leather and products 


In the Trade division: 
46. Retail group 
48. General merchandise 
49. Apparel and accessories 


In the Services division: 
74. Motion pictures 


Food stores, a subgroup in the Retail group, were also included because they 
yield a particularly high deviation figure. 
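The selection rule described above (a group or subgroup qualifies if, in at least one fiscal-year month, its total assets exceed 20 per cent of the total for all its balance-sheet returns) can be sketched as follows. The group names and dollar figures are hypothetical and are not taken from Statistics of Income; only the 20 per cent threshold and the eleven fiscal-year months come from the text.

```python
FISCAL_MONTHS = ["Jul", "Aug", "Sep", "Oct", "Nov",
                 "Jan", "Feb", "Mar", "Apr", "May", "Jun"]

def qualifies(monthly_assets, total_assets, threshold=0.20):
    """True if any single fiscal-year month exceeds threshold * total assets."""
    critical = threshold * total_assets
    return any(monthly_assets.get(m, 0.0) > critical for m in FISCAL_MONTHS)

# Hypothetical figures for illustration only, in millions of dollars
groups = {
    "Group A": {"total": 1000.0, "monthly": {"Jan": 260.0, "Jun": 90.0}},
    "Group B": {"total": 5000.0, "monthly": {"Jun": 400.0, "Sep": 300.0}},
}
selected = [name for name, g in groups.items()
            if qualifies(g["monthly"], g["total"])]
print(selected)
```

Group A qualifies on the strength of one month alone, whereas Group B does not even though its fiscal-year assets are larger in absolute terms, since no single month passes its own critical figure.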

For each of the various divisions and of the selected groups and sub-groups, 
the monthly distribution can then be determined by the method used for 
Table 4, except that in the present cases no separate figures are available for 
part-year returns. The December figure shown in the tables of this and following 
sections includes therefore not only the calendar-year but also the part-year 
returns, but the part-year returns presumably account in any instance for only 
a small portion of the December figure. Monthly distributions can be worked 
out not only in terms of number of returns, as in the present section, but also 
in terms of those accounting items for which fiscal-year data are available in 
Statistics of Income, as in Sections 14-16. For any monthly distribution the 
deviation of the center of the average year from July 1 can be calculated by the 
method used for Table 8. It is stated in months (or fractions of a month), 
measured positively if the center of the average year follows July 1. 
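The method of Table 8 is not reproduced in this section, so the following is only a sketch under a stated assumption: that the deviation is the weighted mean, over terminal months, of the whole-month offset of each accounting year's center from July 1, with part-year returns ignored. The offsets and the sample distribution are assumptions made for illustration.

```python
# Assumed offset, in months, of the center of each accounting year from July 1, 1949.
# A year ending December 31, 1949 is centered on July 1 (offset 0); a year ending
# one month earlier or later is centered one month earlier or later.
OFFSETS = {"Jul": -5, "Aug": -4, "Sep": -3, "Oct": -2, "Nov": -1, "Dec": 0,
           "Jan": 1, "Feb": 2, "Mar": 3, "Apr": 4, "May": 5, "Jun": 6}

def center_deviation(distribution):
    """Weighted mean offset (months) for a distribution keyed by terminal month."""
    total = sum(distribution.values())
    return sum(OFFSETS[m] * w for m, w in distribution.items()) / total

# Hypothetical percentage distribution, not taken from Table 10
example = {"Dec": 70.0, "Jun": 10.0, "Mar": 5.0, "Sep": 5.0,
           "Jan": 4.0, "Oct": 3.0, "Jul": 3.0}
print(round(center_deviation(example), 2))
```

Under this assumption a class dominated by June and March year-ends gives a positive deviation, while heavy use of years ending in July through November pulls the center back before July 1.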

The analysis in this section is confined to distributions in terms of number of 
returns, and covers all returns, whether or not accompanied by balance sheets, 
for each specified industrial class. While the main bodies of statistics analyzed 
in this and following sections are available only for the years 1946-1949, some 
roughly comparable statistics in terms of number of returns, particularly for 
the broad divisions, are available and have been examined for 1934 and 1939, 
but they are not reported here. Comparisons between 1946 and 1949 seem 
likely to be almost entirely free from distortion on account of changes in industrial 
classification.³⁰ 

First examining each division in Table 10 separately, we find that in the 





30 See Section 11, and especially footnotes 3-10. 
interval 1946-1949 the figure for each fiscal-year month rises and the December 
figure declines. The great extension in the use of fiscal-year reporting in recent 
years not only affected all divisions, but it also had a fairly uniform impact 
upon all of the fiscal-year periods. In comparing the pattern of the monthly 


TABLE 10 

PERCENTAGE DISTRIBUTION BY FILING PERIOD OF NUMBER OF RETURNS 
IN EIGHT BROAD INDUSTRIAL DIVISIONS, 1946 AND 1949, AND 
DEVIATION OF CENTER OF AVERAGE YEAR FROM JULY 1 

* Includes calendar-year and all part-year returns. 


distribution among divisions, attention may be confined to 1949 figures. The 
terminal months for the four fiscal years showing the two highest and two 
lowest percentages, for each division, are: 
                      Highest      Next to Highest    Lowest 

Agriculture           June         March              November 
Mining                June         March              January 
Construction          March        June               November 
Manufacturing         June         March              January 
Public utilities      June         September          January 
Trade                 June         January            November 
Finance               June         September          January 
Services              June         September          January 


For all but one division, June is the highest month: fiscal years ending in June 
are more common than any of the other ten fiscal-year periods. For five di- 
visions, January is the lowest month, while November is lowest for the three 
other divisions. But the considerable scatter of the next-to-highest and next-to-lowest 
figures among the various months implies that the details of the monthly 
pattern vary considerably among the divisions. The predominance of June as 
the high terminal month, and of January as the low, should not be allowed to 
obscure this basic variation among the divisions. It will be even more apparent 
when figures in terms of some other measure than number of returns are studied 
in later sections, and even in terms of number of returns more striking variation 
will be found among the industrial groups and subgroups. 

The length of time by which the center of the average year follows July 1 
increases from 1946 to 1949, for all divisions except Manufacturing and Trade. 
The 1949 deviations range from 0.25 month for Finance to 0.69 month for 
Agriculture, or from about eight to about twenty-one days. Hence, even with 
fiscal-year reporting advanced to its 1949 stage, the center of the average year 
did not for any division fall more than twenty-one days after July 1; this devia- 
tion is, for most analytical purposes, probably negligible. 
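The conversion from months to days in this paragraph can be checked directly. The sketch assumes an average month of 365/12 days, which the text does not state explicitly; the helper name is illustrative.

```python
DAYS_PER_MONTH = 365 / 12  # assumed average month length for the conversion

def months_to_days(months):
    """Convert a deviation in months to the nearest whole day."""
    return round(months * DAYS_PER_MONTH)

for label, dev in [("Finance", 0.25), ("Agriculture", 0.69)]:
    print(label, months_to_days(dev))
```

With that assumption the two extremes of 0.25 and 0.69 month give about eight and about twenty-one days, as stated.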





31 Some error may affect the reckoning of the center of the average year because of the way the part-year returns 
are treated. See Section 7 and also Appendix C. 
We turn now to corresponding figures for the six groups and subgroups of the 
selected list, which appear in Table 11. Although frequent exceptions appear, 
study of each group and subgroup separately leads to the same broad conclu- 
sion as Table 10: the percentage for each of the eleven fiscal-year months rose 
from 1946 to 1949. Correspondingly, the figure for December declined in each 
of the six classes. As in Table 10, the percentages in this table indicate that the 
recent increase in fiscal-year reporting was widespread among various lines of 
industry and affected each of the fiscal-year periods for every class ex- 
amined. 

We can again get a rough picture of the monthly pattern of reporting for the 
various industries by listing the two highest and two lowest terminal months 
for each class in 1949: 

Highest       Next to Highest    Next to Lowest 

November      June               July 
June          January            May 
June          March              January 
January       June               October 
January       July               May 
August        June               November 


Two of the six classes have January and two have June as the high terminal 
month. For four of the six classes, November is the low terminal month. But 
again, examination of the next-to-highest and next-to-lowest terminal months 
reveals wide diversity; and this fact, coupled with the detailed differences 
revealed by the 1949 figures in Table 11, emphasizes the diversity in shape of the 
monthly pattern among the six classes. More emphatic evidence on this point 
will be set forth in later sections. 

Examination of the deviation-from-July 1 figures for each of the six classes 
shows, with two exceptions, a general tendency for the center of the average 
year to fall later with passing time. We may note a single instance of a negative 
deviation: for Motion pictures in 1946, the center of the average year fell 0.04 
month before July 1, or about one day. The highest 1949 deviation is 0.44 month 
and the highest 1946 deviation is 0.55 month, both for Leather and products. 
These represent about 13 and 17 days, respectively. So 
far as these six classes are concerned, and with the center of the average year 
determined in terms of number of returns, we can conclude that the deviations 
from July 1 are probably negligible for most analytical purposes. 

Thus far in this section the number-of-returns analysis has been with reference 
to all returns, whether or not accompanied by balance sheets. When 
number-of-returns percentages are calculated for the balance-sheet returns 
alone, for nearly all industrial classes the percentage of the total number of 
returns filed as fiscal-year returns is somewhat higher.” This slight excess is, 
in most classes, fairly evenly distributed among the fiscal-year periods. Hence 
the shape of the monthly pattern, in terms of number of returns, is about the 
same in both cases, with a few slight disparities. 





™ Such calculations have been carried out for all industrial divisions and the six selected classes; the results, 
though not presented herein, provide the basis for certain remarks in the text. 

TABLE 11 


PERCENTAGE DISTRIBUTION BY FILING PERIOD OF NUMBER OF RETURNS 
IN SIX SELECTED INDUSTRIAL CLASSES, 1946 AND 1949, AND 
DEVIATION OF CENTER OF AVERAGE YEAR FROM JULY 1 

Apparel and Accessories (49)     Motion Pictures (74) 
Deviation from July 1 

® Includes calendar-year and all part-year returns. 

14. Monthly distribution of accounting years by line of industry: total assets. 
Balance-sheet fiscal-year returns are tabulated in Statistics of Income separately 
for each fiscal-year month, by industrial classes, only for the years 1946-1949, 
and the present analysis is confined to 1949. Although figures are tabulated for 
number of returns and total assets, the present section is concerned only with 
total assets—with the percentage distribution, for any industrial class, of the 
total assets of that class among the various accounting periods. As in the 
previous section, the accounting periods covered are the eleven fiscal-year pe- 
riods with terminal months July to November and January to June, and the 
December period, which again includes part years. 

Table 12 shows the results for the eight broad industrial divisions and the 
six selected classes. The two highest and two lowest 1949 percentages, among 
the fiscal-year months, for each division, can be identified from the first eight 
columns of Table 12 as follows: 

    Highest      Next to Highest
    June         March
    May          August
    June         March
    June         October
    June         August
    January      June
    June         September
    August       June


Comparison of these results with the corresponding figures in Table 10 shows 
that in not a single division are the four indicated months identical for the per- 
centages in terms of total assets and those in terms of number of returns. This 
and more detailed differences between the two tables may be ascribed to varia- 
tions in average total assets among the groups of corporations filing for various 
periods. 

The present list still shows June as predominantly the high month. For five 
divisions January is the low month, as was the case for Table 10; of these, four 
show February as next-to-lowest month. That January should in numerous 
cases be the low month, particularly in terms of number of returns, may reflect 
the influence of the accounting profession: efforts to reduce the peak auditing 
load incident to the huge number of calendar-year returns would probably steer 
away from a shift to a fiscal year ending in the following month when auditing 
work on such returns would overlap work on calendar-year returns. 

Nevertheless, a major reason for choosing a particular fiscal-year period 
probably lies in certain seasonal characteristics of the industry involved. The 
accounting requirements of public regulatory bodies operate in some industrial 
lines to preserve the practice of calendar-year reporting. But, wherever such 
requirements do not control, I believe that seasonal factors are mainly responsi- 
ble for the developing monthly pattern of reporting. 

The deviation-from-July-1 figures show wide differences among the divisions. 
For the Services division, the figure is negative: the center of the average year 
falls one day before July 1. The highest deviation is for Agriculture: here the 
0.71 month represents about twenty-two days. So far as the divisions are 
concerned, we conclude that the centers of the average years, in terms of total 
assets, depart from July 1 by time intervals which are probably negligible for 
most analytical purposes. 

TABLE 12 


PERCENTAGE DISTRIBUTION BY FILING PERIOD OF TOTAL ASSETS 
TABULATED FROM BALANCE SHEETS, FOR EACH BROAD DIVISION AND 
SELECTED CLASS, 1949, AND DEVIATION OF CENTER 
OF AVERAGE YEAR FROM JULY 1 


Year Ending    Agriculture    Mining & Quarrying    Construction    Manufacturing    Public Utilities 


July 51 0.91 1.43 0 18 
August .B5 2.18 1.78 0.29 
September 2.70 0.18 
October 4.01 
November 
December* 
January 
February 
March 
April 
May 
June 

The differences in the shape of the monthly pattern among the six selected 
industrial classes are much more striking than those among the divisions. In- 
stead of a corresponding schedule listing the four high and low months in each 
class, percentages are shown for the six classes, because here the numerical 
magnitudes are more informative as to differences in pattern. The high and low 
months are determined in terms of total assets from Table 12, and the per- 
centages are those for 1949 from Tables 11 and 12. 


                        High        Total     Number of    Low        Total     Number of 
                        Month       Assets    Returns      Month      Assets    Returns 

Leather and products    November                           July       1.22 
Retail group            January                                       1.17 
Food stores             February                                      0.38 
General merchandise     January     64.30                  October    0.12 
                                                           April      0.12 
Apparel stores          January                            April      0.88 
Motion pictures         August                                        0.46 
First disregarding the figures based on number of returns, we observe that 
the high and low months vary widely among the classes. The commonest high 
month is January, with three cases, all in retail trade; among the low months 
April and November each appear twice. In the retail trade group, General 
merchandise (in which department stores predominate) shows April tied with 
October for low place, and Apparel stores also shows April as low. This is more 
evidence of the striking role of seasonal factors: presumably because of the 
heavy Easter trade, these stores avoid a fiscal year ending at that time. The 
high percentages range up to a maximum of 64.30 for General merchandise, and 
the low percentages range from 0.12 for General merchandise to 1.22 for Leather 
and products.” Further evidence on the great diversity in the high and low 
months and in the level of the high and low percentages can of course be ob- 
tained by comparing in detail the monthly percentages in Table 12. 

When we compare total-assets and number-of-returns percentages for the 
high and low months listed above for the six classes, we find that the total-assets 
percentage of the high month is higher (in several instances, very much higher) 
than the corresponding number-of-returns percentage. In each class the 
total-assets percentage of the low month is lower (in several instances, very 
much lower) than the corresponding number-of-returns percentage. In a sense, this 
is not surprising, because the high and low months have been determined from 
the total-assets percentages, and high and low months determined from the 
number-of-returns percentages have been found in general not to be the same 
for each industrial class as these total-assets high and low months. Neverthe- 
less, if we examine Tables 11 and 12 separately for each class, we find in general 
that the total-assets high percentage is above the number-of-returns high per- 
centage (which frequently occurs in a different month), and that the reverse is 
true for the low percentages. The range between high and low is generally much 
greater for the total-assets than for the number-of-returns percentages. 

The explanation appears again to run in terms of average total assets. While 
for fiscal-year returns in general, the average total assets ordinarily falls below 
that for calendar-year returns, the average total assets for a particular fiscal- 
year period—which may turn out to be the high for the specific industrial class 
—may stand far above the general average. This may, in certain cases, be due 
to the fact that a few very large corporations of the class happen to choose that 
fiscal-year period. Similarly, for a particular fiscal-year period—which may 
turn out to be the low for the specific industrial class—the average total assets 
may fall even below the general average of that class for all fiscal-year returns. 
Further discussion on this point appears in Part III. 

The deviation-from-July-1 figures in Table 12 range from -0.96 month for 
Motion pictures to 0.92 month for Food stores, that is, from about twenty-nine 
days before July 1 to about twenty-eight days after July 1. While these maximum 
deviations are still not very large, they are greater than those found in terms of 
number of returns. And a deviation of approximately one month may be large 
enough so that, for the industrial class involved, some allowance should be made 
in analyzing corporate data from Statistics of Income whenever the analysis 
relies heavily on the centering of the average year. I should remark also that 
the scatter of the figures about the center of the average year is in general 
greater for the specific industrial classes than for the entire corporate system, 
and greater in terms of total assets than in terms of number of returns. The 
average year is correspondingly less typical of the various accounting periods. 

* The minimum of the high percentages is not cited because, as indicated in Section 13, all of these six classes 
except Food stores were selected on the criterion that the total-assets percentage for some fiscal-year month in 1949 
must be at least 20. Except for this limitation, we might have found numerous high percentages (for example, for 
certain groups or subgroups in the Public utilities or Finance division) much below 20. 
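
The deviation figures discussed in this section can be illustrated by a weighted average. The sketch below is not the procedure of Section 7 and Appendix C; it assumes that each accounting year is centered six months before the end of its terminal month, that the tabulation for a given year covers fiscal years ending from July of that year through June of the next, and that part-year returns may be ignored. The percentages shown are hypothetical.

```python
# Offset, in months, of the center of each accounting year from July 1.
# Assumption: years ending July to November close before the calendar
# year does, and years ending January to June close after it.
OFFSETS = {
    "July": -5, "August": -4, "September": -3, "October": -2,
    "November": -1, "December": 0, "January": 1, "February": 2,
    "March": 3, "April": 4, "May": 5, "June": 6,
}


def deviation_from_july_1(percentages: dict) -> float:
    """Weighted mean offset (in months) of the center of the average year."""
    total = sum(percentages.values())
    weighted = sum(OFFSETS[month] * pct for month, pct in percentages.items())
    return weighted / total


# Hypothetical class: most weight on the calendar year, some on three
# fiscal-year periods.
hypothetical = {"August": 10.0, "December": 70.0, "January": 5.0, "June": 15.0}

print(round(deviation_from_july_1(hypothetical), 2))  # 0.55
```

A class with heavy weight on an early terminal month such as August would, on these assumptions, show a negative deviation, and one with heavy weight on January or later months a positive one.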

15. Monthly distribution of accounting years by line of industry: net income 
of net-income corporations. For certain analytical purposes, particularly in fore- 
casting tax revenue and some other aspects of tax analysis, special attention 
attaches to the net income of corporations showing net income, as distinct from 
those showing deficits. The present section therefore describes the monthly 
pattern of reporting determined from figures on net income of corporations in 
the net category, for various lines of industry. Here, as in Sections 13 and 14, a 
primary purpose is to discover whether the patterns are such that the center of 
the average year may, for some industrial classes, depart sufficiently from July 1 
to render an assumption of July 1 centering seriously invalid. The figures are 
presented only for 1949, and the percentages for the net category of any in- 
dustrial class are obtained by dividing the net income for a particular reporting 
period by the total net income of the class. 
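
The calculation of these percentages is simple enough to state in a few lines. The following sketch uses hypothetical net-income amounts; the function name and the figures are illustrative assumptions and are not taken from Statistics of Income.

```python
def percentage_distribution(amounts: dict) -> dict:
    """Express each reporting period's amount as a percentage of the class total."""
    total = sum(amounts.values())
    if total == 0:
        raise ValueError("class total is zero; percentages are undefined")
    return {period: 100 * amount / total for period, amount in amounts.items()}


# Hypothetical net income of net-income corporations, by reporting period
# (thousands of dollars).
net_income = {"December": 700, "January": 100, "June": 200}

for period, pct in percentage_distribution(net_income).items():
    print(f"{period}: {pct:.1f} per cent")
```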

Table 13 shows the percentages for each broad division and for the six se- 
lected classes. Some tendency exists for the very high and the very low months 
to be about the same in Table 13 as in Table 12, for any division or class. One 
can infer that although net income is a less stable measure of comparative im- 
portance among the fiscal-year periods than total assets, some basis exists for 
expecting net income to yield high or low percentages in the same months as 
total assets. Net income, which fluctuates greatly from year to year, may per- 
haps be expected to have an average level over a period of years which would 
yield percentages in the main consistent with those for total assets. But, in view 
of the varying impact with which changes in net income may hit corporations 
filing for different fiscal-year periods, this tendency should be expected to have 
no great force in any particular year, such as 1949. Moreover, we should re- 
member that 1949 was a year of only mild recession from a high level of pros- 
perity—a year in which the great bulk of corporations fell in the net category— 
and this presumably means that the net-income distribution among accounting 
years was exceptionally representative of all corporations, net and no-net 
categories combined, in 1949.* 

The deviations of the center of the average year from July 1 show no wide 
variation among the eight divisions. The minimum is 0.03 month for Public 
utilities, and the maximum is 0.47 month for Agriculture, a range from about 
one day to about fifteen days after July 1. For 1949 we can therefore conclude 
that the net income figure for any division belongs to an average year with a 
center differing from July 1 by a time interval which is probably negligible. If 
our records included a year of deep depression, very different results for the 
deviation of the center of the average year based on net income might appear 
for the various divisions and classes. Conceivably we might then find a 
deviation several times as great, positively or negatively, as that shown in Table 13, 
for any particular division or class. And if fiscal-year reporting continues to 
develop, further changes in the monthly pattern might result. Hence, results 
for any particular year, such as 1949, cannot be trusted to indicate the 
maximum deviations for some other year, reckoned in terms of net income. 

™ These comparisons of net-income percentages with those based upon number of returns and total assets might 
have been improved if each of the latter had been confined to the net category. But in view of the minor importance 
of the no-net category in 1949, these test comparisons were not made. 

TABLE 13 


PERCENTAGE DISTRIBUTION BY FILING PERIOD OF NET INCOME OF 
NET INCOME CORPORATIONS, FOR EACH BROAD DIVISION 
AND SELECTED CLASS, 1949, AND DEVIATION OF 
CENTER OF AVERAGE YEAR FROM JULY 1 

* Includes calendar-year and all part-year returns. 

The deviations among the six selected classes shown in Table 13 range from 
—0.49 month for Motion pictures to 0.96 month for Food stores—from about 
fifteen days before July 1 to about thirty days after July 1. This range is much 
wider than that found for the broad divisions. In analyses of the class net-in- 
come figures, one may therefore not safely assume for the classes showing the 
maximum (positive or negative) deviations in 1949 that the average accounting 
period centers on July 1. And for some other year, particularly a year of deep 
depression or of much higher prosperity than 1949, the deviation for one or 
more industrial classes might fall far outside the maximum range found for 
1949. In such instances, treating the average accounting period for the class as 
centering at July 1 of that year would be even less, and perhaps very much 
less, justifiable. 

To what extent considerations of this sort can be taken into account in esti- 
mating tax revenue or in other tax analyses, I do not attempt to state. But the 
foregoing analysis seems to indicate that any such investigations involving at- 
tention to industrial differences in net income should be carried forward, for 
any year, with the firm realization that these variations in the location of the 
center of the average accounting period introduce margins of error into the 
conclusions. In a year differing widely from 1949, the investigation might even 
need to attempt certain numerical adjustments for this departure of the center 
of the average accounting period from July 1. 

16. Monthly distribution of accounting periods by line of industry: net income 
of both categories combined. The net income (or deficit) of both net and no-net 
categories combined is worthy of attention, because various analysts of financial 
and business conditions rely upon Statistics of Income compilations of 
profits, for the entire corporate system and for various lines of industry, in ap- 
praising past and current developments. Does the increased use of fiscal-year 
reporting invalidate the assumption customarily made in such analyses that the 
profit figure pertains to a year centering at July 1? Evidence presented in this 
section for 1949 suggests an answer to this important question. The basic figures 
for the net income (in rare cases, the deficit) for both net and no-net categories 
combined relate to all returns, whether or not accompanied by balance sheets. 

Table 14 presents the results for each of the broad divisions and selected 
classes. Figures for the deviation from July 1 are not large in any division. The 
range is from 0.02 month for Public utilities to 0.43 month for Agriculture— 
from less than one day to about thirteen days after July 1. Hence the maximum 
departure from July 1 is small enough to warrant the conclusion that, for each of 
the divisions in 1949, the tabulated net income for all returns may probably be 
regarded as pertaining to a year centering at July 1 without serious error. For 
the reasons pointed out in Section 15, however, this conclusion cannot be depended 
upon for other years. 

For the six specially selected groups and subgroups a minus percentage 
appears at some points: for example, Leather and products, for the fiscal year 
ending September 1949. This merely means that, whereas the two categories 
combined showed a positive net income for the class as a whole, they showed a 
deficit for that particular fiscal-year period. Again, one should remember that 
the indicated monthly pattern might be very different for some other year. 
Changes in net income can have different impacts on groups of corporations, 
or on particular corporations, reporting for different fiscal-year periods, and 
thereby work year-to-year changes in the monthly pattern. But these different 
impacts can, and probably do, arise also from industrial differences among the 
fiscal-year periods. Even for classes as narrowly defined as the six shown in the 
table, the aggregate figure of any class is made up of figures for various 
subclasses. For example, the Leather and products class is made up of corporations 
tanning and finishing leather, those manufacturing leather footwear, and those 
manufacturing various other leather products. A possibility exists that the 
fiscal-year returns of one of these subclasses may pertain predominantly to a 
year ending in October, those of another to November, and those of others to 
other months. 

TABLE 14 


PERCENTAGE DISTRIBUTION BY FILING PERIOD OF NET INCOME 
(OR DEFICIT) OF ALL CORPORATIONS, FOR EACH BROAD DIVISION 
AND SELECTED CLASS, 1949, AND DEVIATION OF CENTER 
OF AVERAGE YEAR FROM JULY 1 

The figures for deviation of the center of the average year from July 1 range 
from -0.49 month for Motion pictures to 1.00 month for Food stores, that is, from 
about fifteen days before July 1 to thirty-one days after July 1. The second 
figure is one of the largest we have found thus far. A deviation of one full 
month, in the location of the center of the average year with reference to July 
1, seems unlikely to be negligible for many purposes of careful analysis of profits 
data. For some other year, particularly one with cyclical conditions much more 
favorable or unfavorable than in 1949, the deviation for Food stores, or even 
for some of the classes with smaller deviations in 1949, might exceed one 
month by a wide margin. 

Many other classes besides the six shown in Tables 11-14 have been an- 
alyzed for 1946 (and in some cases 1939) and 1949, in terms of the various cri- 
teria used in Tables 10-14, though the results are not presented here. For one 
particularly informing case—the manufacturing group, Transportation equip- 
ment—the deviation of the center of the average year, in terms of net income 
of the combined categories, is commented on here. The principal subclasses in 
this class are: railroad equipment, aircraft and parts, and ship- and boat-build- 
ing. We must bear in mind that the postwar adjustments of 1945-1946 hit in- 
dustries of this class with great force, and the impact was especially severe 
on the aircraft manufacturers.“ The class as a whole in 1946 showed net 
income in the net category of $175 million, and deficit in the no-net category 
of $188 million; and for the aircraft subclass the comparable figures were $38 
million and $156 million. 

The deviation of the center of the average year in 1946 is —2.97 months: 
the center of the average year is almost three full months before July 1. Surely 
this deviation is not negligible; surely one cannot assume that July 1 is the 
center of the average year in this case. This example clearly indicates what 
extreme distortions can be produced in the monthly pattern, and in the position 
of the center of the average year, by a wide cyclical upheaval or other violent 
factors affecting profit realization. 
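
The arithmetic behind this example can be made explicit. The sketch below merely nets the 1946 figures quoted above for the two categories; the remark that a small combined total may magnify the resulting percentages is offered as a plausible reading, not as an explanation stated in the text.

```python
def combined_net_income(net_income: float, deficit: float) -> float:
    """Net income of the net category less the deficit of the no-net category."""
    return net_income - deficit


# 1946 figures quoted in the text, in millions of dollars.
figures = {
    "Transportation equipment (class)": {"net_income": 175, "deficit": 188},
    "Aircraft and parts (subclass)": {"net_income": 38, "deficit": 156},
}

for name, f in figures.items():
    total = combined_net_income(f["net_income"], f["deficit"])
    print(f"{name}: combined net income = {total} million dollars")
```

On these figures the class as a whole shows a combined deficit of 13 million dollars, small in relation to either component, so that the share of any one filing period in the combined total can be very large in absolute value.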


PART III. DIFFERENCES ACCORDING TO SIZE OF CORPORATION 


17. Average total assets per return: industrial classes. At various points in 
preceding sections, attention has been called to the possible effects of size dif- 
ferences among corporations, particularly between fiscal-year and non-fiscal- 
year corporations, upon the percentage ratios and patterns under study. In 
this and the following section, more direct attention is given to this aspect 
of the problem; the present section is concerned with the differences in size 
between fiscal-year and other returns, for the various lines of industry. 

Despite the accounting and other factors which may affect it for any particu- 
lar corporation, total assets appears to be unmistakably the best measure of 
size for comparisons among corporations or groups of corporations. The 
simplest comparison of size between two groups of corporations is in terms of 
average total assets per corporation of each group. Unfortunately, average 
total assets for a group of corporations—for example, those of an industrial 
class—is not highly typical of the various corporations included in the group. 
This is because of the peculiar shape of the size distribution, in terms of total 
assets, among the corporations of the group. This shape is marked by an 
enormous concentration of corporations at the low end of the size scale, with 
the number of corporations within an interval on the size scale diminishing 
steadily as size increases, and with a very few extremely large corporations 
at the high end of the size scale.* For such a distribution, the average of total 
assets is likely to lie in a size interval which includes only a fairly small portion 
of the total number of corporations: the average is not typical because relatively 
few actual cases lie near the average. Despite this limitation on the significance 
of average total assets, comparisons of such averages yield highly useful 
information about broad differences in size among groups of corporations. 

® S. of I., 1946, pp. 98-09. 

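
Why the average is not typical for a distribution of this shape can be seen from a small constructed example. The one hundred sizes below are hypothetical and are chosen only to imitate heavy concentration at the low end with a single very large corporation; they are not drawn from Statistics of Income.

```python
from statistics import mean, median

# Hypothetical total assets (thousands of dollars) for 100 corporations.
sizes = [50] * 60 + [200] * 25 + [1_000] * 10 + [5_000] * 4 + [500_000] * 1

average = mean(sizes)
middle = median(sizes)
share_below_average = sum(1 for s in sizes if s < average) / len(sizes)

print(f"average total assets: {average}")
print(f"median total assets:  {middle}")
print(f"share of corporations below the average: {share_below_average:.0%}")
```

In this constructed case the average is 5,380 while the median is 50, and 99 of the 100 corporations fall below the average.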
Table 15 presents average total assets in 1949, by industrial class and for all 
classes combined, for all balance-sheet returns, all fiscal-year returns, and all 
non-fiscal-year returns. The first column shows enormous differences in average 
total assets per return among the various lines of industry. The range is from 
about $50,000, for Eating and drinking places and Miscellaneous repair 
services, to about $30 million for Petroleum and coal products and Insurance 
carriers. This huge range, and the differences within that range, have no direct 
bearing upon our present analysis; but they do indicate that the average total 
assets for any inclusive group (for example, all industrial classes combined, or 
even some of the broad industrial divisions) is the result of combining widely 
diverse figures. Hence, in the present inquiry, much may be gained by giving 
chief attention to narrowly defined classes, where the figures are probably less 
seriously affected by such diversity. 

* Such a size distribution, technically described as J-shaped, is illustrated in Appendix D. The figures shown 
there are for 1934, but although the shape in some other year, such as 1949, may be different in matters of detail, the 
main characteristics of the shape appear in all years. 


TABLE 15 


AVERAGE TOTAL ASSETS PER RETURN FOR ALL RETURNS, FISCAL-YEAR 
RETURNS, AND NON-FISCAL-YEAR RETURNS, BY INDUSTRIES, 
AND RATIO OF FISCAL-YEAR TO NON-FISCAL-YEAR AVERAGE, 1949 


(dollars in thousands) 


Column headings: All Returns; Fiscal-Year Returns (a); Non-Fiscal-Year Returns (b) 


1. All industrial groups 

2. Agriculture, forestry and fishery 
    Farms and agricultural services 
    Forestry 
    Fishery 

Mining and quarrying 
    Metal mining 
    Anthracite mining 
    Bituminous coal and lignite mining 
    Crude petroleum and natural gas production 
    Nonmetallic mining and quarrying 

Manufacturing    1,122 
    Beverages    1,032 
    Food and kindred products    973 
    Tobacco manufactures    13,324 
    Textile-mill products    1,270 
    Apparel and products made from fabrics    203 
    Lumber and wood products, except furniture    535 
    Furniture and fixtures    289 
    Paper and allied products    1,807 
    Printing, publishing, and allied industries    377 
    Chemicals and allied products    1,603 
    Petroleum and coal products    28,968    2,088 
    Rubber products    3,103    2,238 
    Leather and products    475    608 
    Stone, clay, and glass products    813    364 
    Fabricated metal industries    4,061    1,198 
    Fabricated metal products, except ordnance, machinery, and transportation equipment 
    Machinery, except transportation equipment and electrical 
    Electrical machinery and equipment 
    Transportation equipment, except motor vehicles 
    Motor vehicles and equipment, except electrical 
    Ordnance and accessories 
    Scientific instruments 
    Other manufacturing 

Trade 
    Wholesale 
        Commission merchants 
        Other wholesalers 
    Retail 
        Food 
        General merchandise 
        Apparel and accessories 
        Furniture and house furnishings 
        Automotive dealers and filling stations 
        Drug stores 
        Eating and drinking places 
        Building materials and hardware 
        Other retail trade 
    Trade not allocable 

Finance, insurance, real estate, and lessors of real property 
    Finance 
        59. Banks and trust companies 
        60. Credit agencies other than banks 
        61. Holding and other investment companies 
        62. Security and commodity-exchange brokers and dealers 
    63. Insurance carriers and agents 
        64. Insurance carriers 
        65. Insurance agents and brokers 
    66. Real estate, except lessors of real property other than buildings 
    67. Lessors of real property, except buildings 

68. Services 
    69. Hotels and other lodging places 
    70. Personal services 
    71. Business services 
    72. Automotive repair services and garages 
    73. Miscellaneous repair services, hand trades 
    74. Motion pictures 
    75. Amusement, except motion pictures 
    76. Other services, including schools 

77. Nature of business not allocable 

The differences between average total assets for fiscal-year returns and for 
other returns range widely among the various industrial classes, as shown in 
columns a and b. For the entire corporate system (all divisions combined), 
and for each of the divisions except Trade and Services, the fiscal-year average 
is smaller than the non-fiscal-year average; and in some divisions, such as 
Public utilities and Finance, it is strikingly smaller. The fiscal-year average 
is also smaller than the non-fiscal-year average for the great majority of the 
groups and subgroups in the corporate system. Only in the Trade and Services 
divisions are the fiscal-year averages frequently larger. 

In appraising these differences among lines of industry, the ratio of the 
fiscal-year to the non-fiscal-year average can be used effectively instead of 
the absolute difference between the two averages in each class. These ratios, 
in percentage form, are shown in the right-hand column of the table. The 
percentages range from 1 for Insurance carriers to 293 for General merchandise: 
in the former case the fiscal-year corporations have an average size only about 
1 per cent of that for other returns, whereas in the latter case the fiscal-year 
corporations are on the average nearly three times as large as other corpora- 
tions. These differences between the average total assets for fiscal-year returns 
and for other returns necessarily account for the differences between the 
fiscal-year percentages in terms of number of returns and those in terms of 
total assets (Table 9).*7 
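
The ratio in percentage form is computed as in the following sketch. For illustration it takes the averages for all industrial classes combined, $359 thousand for fiscal-year returns and $1,330 thousand for other returns, on the assumption that the latter is the figure shown for the December group in Table 16; the function name is hypothetical.

```python
def ratio_percent(fiscal_year_average: float, other_average: float) -> float:
    """Fiscal-year average as a percentage of the non-fiscal-year average."""
    return 100 * fiscal_year_average / other_average


# All industrial classes combined, 1949 (thousands of dollars).
all_classes = ratio_percent(359, 1_330)
print(round(all_classes))  # 27

# A ratio of 293, as cited for General merchandise, means the fiscal-year
# average is nearly three times the other average.
print(293 / 100)  # 2.93
```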

One may ask why in some lines of industry the larger corporations tend on 
the average to report on a calendar-year basis, whereas in others they are 
more likely to file fiscal-year returns. In the extreme case of the Insurance 
carriers, the reason is clear: the great bulk of the corporations in this line, 
many of which are very large, are required by public regulation to compile 
calendar-year statements. The few corporations which escape these 
requirements and file on a fiscal-year basis are very small. In the extreme case in the 
other direction, General merchandise, a large number of huge department 
stores (a major subclass within this class) file on a fiscal-year basis, many of 
them in fact for a fiscal year ending in January. These great enterprises 
dominate the average total assets for fiscal-year returns. Possibly just as simple 
explanations of the observed differences can be found for various other 
industrial classes, but I do not here attempt to point them out. Instead, a more 
intensive study of variations in size among the fiscal-year returns themselves 
is more helpful. 

* Actually, the relation between those percentages can best be set forth in terms of another ratio, not shown in 
Table 15. In fact, the following equation holds for any industrial class or combination of classes: 

    Number-of-returns percentage     average total assets for all returns 
    ----------------------------  =  -------------------------------------------- 
      Total-assets percentage        average total assets for fiscal-year returns 

For all industrial classes combined, Table 9 gives the two percentages as 36 and 13.2 and Table 15 gives the 
average-total-assets figures as 980 and 359. The ratio of each of these two pairs of figures is 2.7. 

18. Average total assets per return: separately by accounting periods. For any 
industrial class, or combination of classes, the average total assets per return 
can be calculated for the returns filed for each of the twelve accounting periods. 
The details of such calculations, for all industrial classes combined in 1949, are 
shown in Table 16. Three of the averages (those for all returns, all fiscal-year
returns, and other returns) are from the first line of Table 15.


TABLE 16 


NUMBER OF BALANCE-SHEET RETURNS, TOTAL ASSETS, AND AVERAGE 
TOTAL ASSETS PER RETURN, SEPARATELY FOR ALL RETURNS, ALL 
FISCAL-YEAR RETURNS, AND ALL FILING PERIODS, 1949 


(dollars in thousands) 


                             Number of         Total       Average Per
                              Returns         Assets         Return

All returns                   554,573      543,561,671          980
All fiscal-year returns       199,912       71,691,136          359
Year ending:
  July                         14,423        4,768,979          331
  August                       15,541        5,607,519          361
  September                    21,958        7,498,518          341
  October                      16,151        7,946,312          492
  November                     12,446        6,540,967          526
  December (a)                354,661      471,870,535        1,330
  January                      16,088        8,471,526          527
  February                     13,448        3,453,270          257
  March                        22,343        5,902,631          264
  April                        17,127        4,686,479          274
  May                          15,863        4,265,161          269
  June                         34,524       12,549,774          364


(a) Includes calendar-year and all part-year returns.


Our main interest is in the averages for the eleven fiscal-year periods. These 
range from $257,000 for February to $526,000 for November. Although one 
of these figures is about double the other, the range is not strikingly wide; 
even the highest of the fiscal-year averages remains far below the over-all 
average of $980,000 for the entire corporate system. Not only is the average 
of total assets for all fiscal-year returns combined below the general average 
of the system, and strikingly below the average for other returns, but it runs 
low for each fiscal-year period separately. Regardless of the terminal month of 
the fiscal-year period, fiscal-year corporations run, on the average, much 
smaller than other corporations. 
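
The averages in Table 16 can be recomputed from the counts and totals printed there, and the same figures illustrate the identity linking the fiscal-year percentages (by number and by total assets) to the two averages. The following is a minimal sketch in Python; the names TABLE_16, FISCAL_MONTHS and average_assets are illustrative assumptions and not part of the original tabulation.

```python
# Figures transcribed from Table 16: (number of returns, total assets in $1,000).
# All identifiers here are illustrative assumptions, not from the source.
TABLE_16 = {
    "July": (14_423, 4_768_979),
    "August": (15_541, 5_607_519),
    "September": (21_958, 7_498_518),
    "October": (16_151, 7_946_312),
    "November": (12_446, 6_540_967),
    "December": (354_661, 471_870_535),  # calendar-year and part-year returns
    "January": (16_088, 8_471_526),
    "February": (13_448, 3_453_270),
    "March": (22_343, 5_902_631),
    "April": (17_127, 4_686_479),
    "May": (15_863, 4_265_161),
    "June": (34_524, 12_549_774),
}
FISCAL_MONTHS = [month for month in TABLE_16 if month != "December"]


def average_assets(returns, assets):
    """Average total assets per return, in thousands of dollars."""
    return assets / returns


# Aggregate the eleven fiscal-year periods, then add the December line.
fiscal_returns = sum(TABLE_16[m][0] for m in FISCAL_MONTHS)
fiscal_assets = sum(TABLE_16[m][1] for m in FISCAL_MONTHS)
all_returns = fiscal_returns + TABLE_16["December"][0]
all_assets = fiscal_assets + TABLE_16["December"][1]

# Fiscal-year share of all returns, by number and by total assets (per cent).
number_pct = 100 * fiscal_returns / all_returns
assets_pct = 100 * fiscal_assets / all_assets

# The two ratios that the identity says must agree.
ratio_of_percentages = number_pct / assets_pct
ratio_of_averages = (average_assets(all_returns, all_assets)
                     / average_assets(fiscal_returns, fiscal_assets))

for month, (returns, assets) in TABLE_16.items():
    print(f"{month:<10} {average_assets(returns, assets):8.0f}")
print(f"{ratio_of_percentages:.2f} {ratio_of_averages:.2f}")
```

The two ratios agree because each reduces to the same quotient of totals and counts; the sketch only confirms this on the 1949 figures.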

But the entire system is made up of many lines of industry, and the findings 
for the system as a whole may conceal much more notable variations among 
accounting periods in particular industrial classes. In order to examine this
point, averages derived by the method of Table 16 were calculated for certain 
selected classes. These classes were selected from those showing especially high 
or low ratios in the final column of Table 15. The results, summarized here, 
show astonishing differences in average corporation size among the eleven 
fiscal-year periods, for each of these classes. The most striking case is Rubber
products, for which the average ranges from $89,000 for the terminal month 
May to $16,808,000 for October. In other words, in this industry corporations 
filing for October were on the average nearly 200 times as large as corporations 
filing for May. 

One may again ask why these conditions prevail, why an exceptional number 
of the large corporations of this industry choose to report for a fiscal year end- 
ing in October, and why the next choice—apart from the calendar year—seems 
to be February (averaging $3,139,000). Without a fairly close knowledge of 
the industry, I cannot give a positive explanation, but, as indicated in more 
general terms earlier, I believe that it is to be found in seasonal factors affecting 
the industry. These factors may relate to the raw materials of the industry, 
to its manufacturing process, or to conditions in the business of its principal 
customers. The two peaks in this case, a major peak in October and a minor 
peak in February, suggest that the Rubber group as a whole consists of at 
least two major subclasses, one of them with a seasonal slack in or shortly after 
October and the other in or shortly after February. 

The other cases show ranges which, had we not seen the Rubber figures, 
would be regarded as striking. They are (in thousands of dollars):


                              Month of                  Month of
                              Lowest       Highest      Highest

Leather and products          January        1,391      November
Transportation equipment      March          9,183      November
General merchandise           April          2,980      January
Motion pictures               April          1,475      August


Again, seasonal considerations are probably a major element in the explana- 
tion of these conditions. 

Why do not seasonal factors affect small corporations to the same degree as 
large? The explanation may often be the differences in form of business activity 
between very large corporations and small corporations within a specified 
industrial class. Thus, a large metropolitan department store is essentially 
a different type of business from the general run of smaller enterprises which 
nevertheless are included in General merchandise, and seasonal factors which 
affect the large department store may be different, or strike with different 
force. I suspect a basic difficulty is that, even with classes as narrowly defined 
as the groups and subgroups listed in Table 15, we may yet have within any 
one class a considerable variety of subclasses affected by various types of sea- 
sonal factor. And, of course, large corporations may be more prevalent in one 
of these subclasses than in another.** 





8 These remarks concerning the complexity remaining within a narrowly defined class can be extended far beyond
the seasonal problem and would then have a bearing, of course, on a wide range of analyses using Statistics of
Income figures classified by industry.
19. Size in terms of average total assets in 1934. The only Statistics of Income
tabulation of fiscal-year returns classifying returns by size of total assets which 
I have found is that for 1934, in which the size classification is given also for 
each industrial division and group.** For each industrial class and for all classes 
combined, that tabulation shows the number of fiscal-year returns for each 
size class, separately for the net and the no-net category of returns. Correspond- 
ing figures for all balance-sheet returns, regardless of accounting period, appear 
in one of the customary tables of Statistics of Income. The fiscal-year tabulation 
gives no other item, such as total assets, besides the number of returns, and a 
task of Appendix D is to estimate the relevant total-assets figures. 

Despite the facts that industrial classifications have so greatly changed 
since 1934 that very few of the 1934 classes are closely comparable with any 
in the 1949 list, that an enormous but by no means uniform increase in fiscal- 
year reporting has occurred since 1934, and that various other changes in 
corporate affairs have taken place which might also impair the 1934 figures 
as an indication of more recent conditions, I nevertheless present a summary 
of facts concerning the situation for all industries combined in 1934. This sum- 
mary will show certain relationships which may be roughly true for more 
recent years, and may also suggest how a more up-to-date tabulation of fiscal- 
year returns by size classes might help answer some of the questions raised in
previous sections of this analysis. 

The ratios of fiscal-year returns to balance-sheet returns for all corporations
are shown as percentages, separately for each assets size class, in Table 17. 
One relationship which holds fairly uniformly among the various size classes is 
that the fiscal-year percentage runs lower for the no-net category than for the 
net category. I can see no clear explanation for this, but suggest very tenta- 
tively that the general extension in the use of fiscal-year reporting—already 
well in progress by 1934, as we saw in Section 2—may in that year have affected 
corporations with a taxable income to a greater extent than those showing a 
deficit. In other words, corporations with a net income may in that year have 
had a tax incentive for shifting to a fiscal-year basis. Or, such influence as the 
auditing profession exerted in favor of fiscal-year reporting may have been 
particularly effective with corporations showing net income. Whatever the 
cause or causes in 1934, we cannot be sure that the same causes were at work 
in other years, such as 1949; I see no reason for confidence that the same re- 
lationship between the net and no-net percentages would be found if we had a 
corresponding size-class tabulation for 1949. 

A second important relationship to be noted is that for both the net and 
no-net categories, the higher percentages appear in the smaller size classes. 
This second relationship seems to me likely to persist at least roughly in other 
years, such as 1949, but this cannot be confirmed without a size classification 
for that year. 

20. Size in terms of net income or deficit. Unfortunately, a size classification 





% S. of I., 1934, pp. 205-207. The same issue also gives, in the preceding pages, a size classification in terms of 
net income (or deficit), by divisions and groups. But, except for the figures for all divisions combined which are
discussed in Section 20, I do not examine these size classifications in terms of income.
of fiscal-year returns on the basis of total assets is available only for 1934. Net
income (or deficit) is a much less satisfactory measure, but a size classification 
of fiscal-year returns on this basis is available for each year covered in the 
present analysis. For the one year 1934 such a size classification is given 
separately for each industrial division and manufacturing group.*® Correspond- 


TABLE 17 


NUMBER OF FISCAL-YEAR BALANCE-SHEET RETURNS AS A PERCENTAGE 
OF BALANCE-SHEET RETURNS WITH NET INCOME AND NO 
NET INCOME, BY ASSETS SIZE CLASS, 1934 


Lower Limit of Size Class (unit: $1,000)        Net

[Table body not recovered. Only fragments of the size-class column survive: 0, 50, 100, ..., All.]

ing tables in Statistics of Income classify all returns, regardless of accounting 
period, on the same size scale of net income and deficit. We can therefore calcu- 
late, for any one of the twenty-three years, 1928-1950, the percentage ratio 
of the number of fiscal-year returns to the number of all returns in each size 
class, separately for the net-income and no-net-income categories. 

In spite of the steady increase in the importance of fiscal-year reporting 
with passing time, affecting the ratios for nearly all size classes, the year-to- 
year changes in these size-class ratios do not appear sufficiently significant 
to warrant detailed examination of the results for each year. Hence, in Table 
18 the presentation of the percentages is limited to selected years at five-year 
intervals. The most striking indication of the table is that for every year, in 
both categories, the percentages for the very small and the very large income 
size classes are below the figure for all classes combined, whereas percentages 
for the middle range of classes run moderately above the over-all figure. 

This can be made more specific by the following summary, which gives for 
each year, separately for the two categories, the size range within which the 
percentage exceeds without exception the all-classes figure. Ranges are given 
from the bottom of the smallest size class to the top of the largest size class 
included (in thousands of dollars) : 





«© S. of I., 1934, pp. 200-204 shows the fiscal-year returns thus classified. Unfortunately, that issue of S. of I.
does not give also a similar size classification of all returns, separately by industrial class, and therefore a percentage
analysis by size class cannot be developed for specific industrial classes even for 1934.
Category        Net           No Net

1929            3-1,000       2-
1934            1-500         1-250
1939            1-1,000       1-250
1944            4-5,000       1-500
1949            2-25          2-500


This middle range begins in all cases at a very low level of income (or deficit), 
and, except for the no-net category in 1929, stops at a level of income which, 


TABLE 18 


NUMBER OF FISCAL-YEAR RETURNS AS A PERCENTAGE OF ALL RETURNS 
WITH NET INCOME AND NO NET INCOME, BY INCOME SIZE 
CLASS, SELECTED YEARS, 1929-1949 


Lower Limit of Size Class (unit: $1,000), by year (Net, No Net)

[Table body not recovered. The surviving row reads: All classes 11.8, 12.2, ..., 20.0, 17.8.]


* Not available separately for 1929 or 1934. In those years, the percentage in the $5,000,000 class applies to all 
returns showing net income (or deficit) of $5,000,000 or greater. 
while it reaches $5 million for the net category in 1944, in general excludes a
long stretch at the high end of the size scale. 

The main conclusions are apparent. For corporations with exceedingly small 
net incomes or deficits—seldom ranging above $2,000—fiscal-year returns are 
less common than on the average for all size classes. For the middle range on 
the size scale, which includes net incomes or deficits which are still small or 
only moderately large, fiscal-year reporting is more common than on the 
average. For very large net incomes or deficits, with the single exception in the
no-net category for 1929, fiscal-year reporting is less common—in some in- 
stances, much less common—than on the average. 

Can these conclusions be restated in terms of size of corporation, basing size
on a more appropriate criterion than net income or deficit? Statistics of Income
presented for 1948 a special tabulation, correlating size of net income (or 
deficit) with size of total assets, for all balance-sheet returns.“ While these 1948 
figures do not pertain to any year shown in Table 18, the tentative opinion 
may be ventured that the main relationships found for 1948 would probably 
hold for another year, particularly a year in which general economic and profit- 
and-loss conditions did not differ markedly from 1948. 

Without attempting an elaborate analysis of these 1948 correlation tables,
we can point out certain broad relationships indicated by the two tables—one 
each for the net and no-net categories—pertaining to all divisions combined. 
First, a very large corporation can show an exceedingly small net income or 
deficit. The following summary from the Statistics of Income tables gives the 
number of very large corporations in terms of assets which are in the very small
net-income (or deficit) size classes (dollars in thousands) : 

                        Lower Limit of Assets-Size Class
Lower Limit of               10,000               50,000
Income (or Deficit)       Net    No Net        Net    No Net

[Table rows not recovered. Surviving entries, in the order found: 1, 1; 1; 1; 3,482, 494, 30, 584; 3,507, 497, 82, 586, 15.]


Clearly, some very large corporations do have very small net incomes or 
deficits, but their number is a very minor fraction of the total, and an even 
smaller fraction for the net than for the no-net category. These numbers are 
also an almost negligible fraction of the total number of corporations of all 
assets-size classes having small net incomes or deficits: thus the seven cases 
in the net category above with net incomes under $1,000 are among a total 
of 63,626 in that income-size class; and the twenty-five cases in the no-net 
category above with deficits under $1,000 are among a total of 67,676 in that 
deficit-size class. 
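
The size of these fractions is easy to check from the counts quoted in the preceding paragraph. The following is a minimal sketch in Python; the function name share_per_cent is an illustrative assumption, and the four counts are those given in the text.

```python
# Counts quoted in the text from the 1948 correlation tables.
# The helper name is an illustrative assumption, not from the source.
def share_per_cent(cases, total):
    """Cases as a percentage of the total number of returns in the class."""
    return 100.0 * cases / total


# Very large corporations among all returns with net income under $1,000.
net_share = share_per_cent(7, 63_626)
# Very large corporations among all returns with deficit under $1,000.
no_net_share = share_per_cent(25, 67_676)

print(f"net: {net_share:.3f} per cent; no net: {no_net_share:.3f} per cent")
```

Both shares come to well under one-tenth of 1 per cent, which is the sense in which the text calls them almost negligible.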






“ S. of I., 1948, pp. 14-27, where the correlation tables are shown for all divisions combined, and separately for
each broad industrial division. Similar tables were presented also for certain earlier years: 1936, pp. 42-43 and 167- 
183; 1937, pp. 188-205. The tables for 1936 and 1937 include also figures for the separate groups within the Manu- 
facturing division. 
In a similar manner, we might summarize the figures for the smallest assets-size
classes, to show whether any appreciable number of such corporations can
show very large net income or deficit. For the two smallest assets-size classes— 
assets under $50,000, and assets between $50,000 and $100,000—we find no 
cases with net income or deficit above $5 million, but a very few cases with 
net income or deficit above $250,000. But again the number of such cases is a 
very small fraction of the total within the assets-size class, and likewise of the 
total within the relevant income-size class. 

Passing now to less extreme evidences of variation, we find for the net 
category indication of fairly clear correlation between assets size and size of 
net income, with, however, a considerable variation or scatter about the line 
(or curve) representing the correlation tendency. The case is much less clearly 
established for the no-net category, which in fact does not indicate at all 
clearly a positive correlation between size of assets and size of deficit. For the 
net category, the following tabulation shows the lower limit of ten assets-size 
classes which is commonest for each income-size class (dollars in thousands).@ 
The tabulation also gives the percentage of the total number of returns in the 
income-size class falling in the specified assets-size class. 

Income Class        Assets Class        Per Cent Concentration

[Table body not recovered.]

For all income-size classes below $10,000, the commonest assets-size class is
the smallest—with assets under $50,000—and in each of these six income-size 
classes, the concentration in the specified assets-size class is fairly high, ranging 
from 32 to 76 per cent. As we go on up the size scale of income, the specified 
assets-size class rises, and in general its lower boundary is considerably higher 
than that of the relevant income-size class. The variations among the concen- 
tration percentages are somewhat bewildering, and arise in part from variations 
in width of the assets-size classes. The main implication of these percentages, 
however, is derived not from their variations but from their general level: With 





* In determining this “commonest” assets-size class, a rough allowance is made for the varying widths of the size 
intervals. The assets-size class is determined within which the peak of the curve distributing all the corporations in
any specified income-size class, according to size of assets, probably falls. I may remark that a corresponding deter- 
mination for the no-net category would show assets under $50,000 as commonest for all income-size classes up to $1 
million. 
few exceptions, at least one-third of the corporations in an income-size class
are concentrated in the commonest assets-size class. 

The conclusion seems justified that the findings from Table 18 can be ex- 
pressed in terms of a size classification based on total assets. Although the very 
smallest and very largest corporations practice fiscal-year reporting consider- 
ably less than the average, the many medium-size and moderately large-size 
corporations practice it more than the average. 

With regard to the finding for the very large corporations, some part of the 
low level of the percentages in Table 18 can perhaps be explained by the effect 
of corporations in the Public utilities and Finance divisions. The figures in 
that table are for all divisions combined, and we have already observed that 
fiscal-year reporting is for special reasons comparatively infrequent for many 
of the large corporations in those two divisions. A conclusive test of this point 
is not feasible, but the special income-size classification by industrial divisions 
for 1934, already cited, yields some indirect evidence on the point. That classifi- 
cation shows that a much larger fraction of the total number of fiscal-year 
returns (all divisions combined) of the smaller income-size classes fall in these 
two divisions than is the case for the very large income-size classes. This is 
not conclusive; a sure finding would be possible only if a corresponding size 
classification by industrial divisions of all returns were also available for 1934. 
But the tentative inference, from the figures actually available, is that fiscal-year
reporting is comparatively more common for small net incomes or deficits,
and comparatively less common for large net incomes or deficits, in the Public
utilities and Finance divisions than in other divisions. On the other hand, a 
disproportionately high fraction of all large corporations is made up of large 
corporations in these two divisions.“ These two facts together may account 
for the low percentages for the large income-size classes in Table 18.





© In 1934, about 31 per cent of all balance-sheet returns, regardless of size, were in the Public utilities and
Finance divisions; but of the returns with assets of $50 million and over each, about 76 per cent were in these two
divisions. S. of I., 1934, pp. 72-04.
APPENDIX A


REFERENCES TO Statistics of Income 
TABULATIONS FROM FISCAL-YEAR RETURNS 


Beginning with the issue for 1926, special tabulations from fiscal-year returns of corporations
have been published in successive annual issues of Statistics of Income.“ I give
below the page references for such tabulations and the relevant textual comment, as I 
have discovered them in the successive issues. Page references are to issues of Statistics 
of Income bearing the same date as “taxable year.” 


Taxable Year                                        Pages

1926 (includes also separate tables for 1925)       21-30
1927                                                21-22, 384-399
1928                                                35-37, 408-423
1929                                                29-30, 351-064
1930                                                31-32, 294-297
1931                                                35-36
1932                                                36
1933                                                35-36
1934 Part 2                                         32-33, 196-207
1935 Part 2                                         19-20
1936 Part 2                                         34-37
1937 Part 2                                         29-30
1938 Part 2                                         43-46
1939 Part 2                                         34-42
1940 Part 2                                         15-16
1941 Part 2                                         15-16
1942 Part 2                                         19-20
1943 Part 2                                         18-19
1944 Part 2                                         18-19
1945 Part 2                                         38-39
1946 Part 2                                         36-54
1947 Part 2                                         16-42
1948 Part 2                                         30-48
1949 Part 2                                         15-32
1950 Part 2                                         21-22


APPENDIX B


EFFECTS ON THE INDICATED HISTORICAL 
RECORD OF VARIATIONS IN NET INCOME 


The possibility that the inclusion of fiscal-year returns with calendar-year returns in the 
tabulations of Statistics of Income might impair the precision of the resulting aggregates 
for certain types of analysis and interpretation has been indicated at various points in this 
report. Two important kinds of possible impairment seem worthy of attention: the dislo- 
cation of the center of the average year from July 1, and the distortion of the shape of the 
historical record. The first kind was examined in Section 7 and at various points in Part 





«# Published annually by the U. S. Treasury, after a delay of two or more years following the completion of the
indicated taxable year. For example: Statistics of Income for 1949, Part 2, Treasury Department, 1953. 
II; the second kind, the measurement of which is much more elusive, receives limited
attention in this appendix. 

The analysis is confined to the net income of the net-income and deficit categories 
combined, without attention to any classification by size or line of industry; but the same 
methods are applicable to the various classes and to other income-account items and, with 
appropriate qualifications, to various balance-sheet items. Since we have no specific 
knowledge concerning the length and timing of their accounting periods (see Appendix C), 
the part-year returns are entirely ignored in the discussion, and their net-income figures 
are excluded from all aggregate net-income figures below. The figures considered are only 
those pertaining to twelve-month accounting periods. 

The nature of the problem can be illustrated for 1949. For that year, Statistics of Income 
shows aggregate net income of the entire corporate system (excluding part-year returns) 
as $27,911 million. This total is made up of $22,208 million for the calendar-year returns, 
and various smaller amounts for the eleven fiscal-year periods ending July-November 
1949 and January-June 1950. While the bulk of the net income was earned by the calendar-year
corporations, and hence pertains properly to 1949, only part of the income reported
for any one of the eleven fiscal-year periods actually pertains to 1949. Moreover,
some income earned in 1949 was presumably reported for eleven other fiscal-year periods
not included in the 1949 tabulation: periods ending January-June 1949 were tabulated for
1948, and periods ending July-November 1950 were tabulated for 1950. The questions are:
Can any adjustments be applied to the figure tabulated for 1949 to allow for excessive
inclusion of net-income for the first set of eleven periods and total exclusion of the second
set; and, would the adjusted figure be an improvement? 

I see no way to attempt such adjustments except by an estimated allocation of the total 
net income of the returns of any one fiscal-year period between two sections of that twelve- 
month period, one section falling within and the other falling outside of the calendar year 
to which the tabulation under study chiefly relates. In the absence of any factual guide, 
the allocation must apparently be made in terms of the time fractions involved. I there- 
fore make the following basic assumption: The aggregate net income of the returns for any 
particular fiscal-year period is earned during any section of that period in an amount pro- 
portional to the time length of that section. If, for example, the section is five months 
long, 5/12 of the annual income originates during that section. 
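
The basic assumption amounts to counting months. The following is a minimal sketch in Python of that count; the function names months_in_year and allocated_share are illustrative assumptions, and every period is taken to be a full twelve months ending in the stated month.

```python
from fractions import Fraction


def months_in_year(end_year, end_month, calendar_year):
    """Months of the 12-month period ending in (end_year, end_month)
    that fall inside calendar_year. Names are illustrative assumptions."""
    count = 0
    for back in range(12):
        # Number the months consecutively so that stepping back across
        # a January boundary lands in the preceding calendar year.
        index = end_year * 12 + (end_month - 1) - back
        if index // 12 == calendar_year:
            count += 1
    return count


def allocated_share(end_year, end_month, calendar_year):
    """Share of the period's income assigned to calendar_year under
    the proportional-to-time assumption."""
    return Fraction(months_in_year(end_year, end_month, calendar_year), 12)


# A fiscal year ending May 1949 has five of its months in 1949.
print(allocated_share(1949, 5, 1949))  # 5/12
print(allocated_share(1949, 5, 1948))  # 7/12
```

Exact fractions are used rather than decimals so that the shares assigned to the two calendar years always sum to one.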

This assumption is almost surely unrealistic. We can discover numerous corporations 
for which evidence clearly shows that the great bulk of the year’s income originates in a 
single quarter, and some probably exist in which most of the income is produced in a 
single month. Conceivably, some of these seasonal peaks of income occur in different 
quarters for different corporations, with the result that the seasonal pattern of aggregate
earnings for a group of corporations—such as those reporting in a particular fiscal-year 
period—may be somewhat smoothed and show a negligible tendency to peak. But this 
prospect may not be very high, for we have noticed that many corporations in a particular 
line of industry are likely to choose an identical fiscal-year period, and the factors produc- 
ing a seasonal peak for one such corporation are likely to have the same effect on others. 
We should remember also that various types of business fluctuation, besides variations 
which are strictly seasonal, affect the profit capacity of one quarter or month in different 
degrees than other parts of the year. I think we must regard as very low the probability 
that income is produced uniformly during the various quarters (or months) of a particular 
accounting period, for the aggregate of corporations using that period. 

For some very large corporations, most of them using the calendar year as a reporting 
period, we do have published figures on quarterly earnings, and, particularly for certain 
regulated enterprises such as railroads, even a monthly summary of earnings is published. 
One cannot escape the conclusion, after examining this type of evidence, that the creation 
of income does not proceed at a uniform pace throughout the accounting year. Moreover, 
in many instances, the figure for the final quarter (or month) of the year is heavily influ- 
enced by various year-end charges and credits: various elements affecting income cannot 
be satisfactorily allocated, even by the corporate management, among periods shorter 
than a year. This is merely a more acute aspect of the practical problem encountered in 
allocations between one year and another. I think two conclusions are warranted. We
cannot use, as a guide for the present purpose, published quarterly (or monthly) figures of 
large corporations because (1) such figures are very imperfect allocations of the annual 
earnings even for such a corporation, (2) large corporations are a poor sample of the entire 
corporate system which is under study here, and (3) large corporations are a particularly 
poor sample of fiscal-year corporations (see Part III). Even if the Treasury did call for 
quarterly or other more frequent reporting of profits from all corporations—large and 
small, fiscal-year and calendar-year—the practical difficulties of providing the accounting 
estimates within each corporation would probably be so great as to preclude any confident 
reliance upon the results.

Proceeding nevertheless with the basic assumption, we undertake now the adjustment 
for 1949. The net income reported on calendar-year returns for 1949 needs no adjustment. 
An adjustment is, however, needed for each of twenty-two fiscal-year periods with terminal
months ranging from January 1949 to November 1950. For three selected periods,
among the twenty-two, the allocation ratios are as follows: 


Terminal Portion of Period in Tabulation Year: 
Month 1948 1949 1950 
January 1949 11/12 1/12 0 
October 1949 2/12 10/12 0 
November 1950 0 1/12 11/12 


All we need do is to apply the appropriate ratio of tabulation year 1949 to the aggregate 
net income of each fiscal-year period which terminates or starts in 1949. Because the peri- 
ods ending July-November 1950 are tabulated in Statistics of Income, 1950, the 1949 
adjustments cannot be carried out until the 1950 tabulations are complete, and this delay 
must be considered an adverse count against the adjustment. 
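
The procedure just described can be sketched as follows. This is a minimal illustration in Python, not the computation actually performed for this report: the function names are assumptions, and the income figures in the example are hypothetical, chosen only to make the arithmetic easy to follow.

```python
from fractions import Fraction


def share_in_year(end_year, end_month, year):
    """Share of a 12-month period ending in (end_year, end_month) that
    falls in `year`, under the proportional-to-time assumption."""
    if end_year == year:
        return Fraction(end_month, 12)
    if end_year == year + 1:
        return Fraction(12 - end_month, 12)
    return Fraction(0)


def adjusted_income(year, calendar_income, fiscal_income):
    """Adjusted net income for `year`.

    calendar_income: net income of calendar-year returns (no adjustment).
    fiscal_income: mapping (end_year, end_month) -> net income of the
    fiscal-year returns for that period.
    """
    total = Fraction(calendar_income)
    for (end_year, end_month), income in fiscal_income.items():
        total += share_in_year(end_year, end_month, year) * income
    return total


# HYPOTHETICAL figures (millions of dollars), for illustration only.
example_periods = {
    (1949, 1): 120,   # year ending January 1949: 1/12 belongs to 1949
    (1949, 10): 240,  # year ending October 1949: 10/12 belongs to 1949
    (1950, 11): 360,  # year ending November 1950: 1/12 belongs to 1949
}
adjusted = adjusted_income(1949, 1000, example_periods)
print(adjusted)  # 1000 + 10 + 200 + 30
```

In an actual application the mapping would hold all twenty-two periods with terminal months from January 1949 to November 1950, which is why the 1950 tabulation must be complete before the 1949 adjustment can be made.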

Applying the ratios as indicated we get estimates of the 1949 portion of the net income 
for each of the twenty-two periods. Summing these twenty-two figures, and adding the 
figure for 1949 calendar-year returns, yields the desired adjusted figure for 1949 net income 
of the entire corporate system (excluding part-year returns). That figure is $28,178 million, 
and is to be compared with the figure as originally tabulated at $27,911 million. The
discrepancy (the “improvement” achieved by the adjustment) is only about 1 per cent.
The corresponding figures for every year from 1940 to 1949 are (in millions of dollars) : 


            Original      Adjusted

1940           8,972         9,013
1941          16,038        16,013
1942          22,981        22,839
1943          27,678        27,693
1944          26,183        26,197
1945          21,051        21,093
1946          24,533        24,268
1947          30,?62        30,908
1948          34,027        34,022
1949          27,911        28,178


Except for 1946, all the discrepancies are smaller, both in amount and as a percentage, 
than that of 1949; and even the 1946 discrepancy is only slightly larger. 
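
The discrepancies can be recomputed from the schedule. The following is a minimal sketch in Python; the names SCHEDULE and discrepancy are illustrative assumptions, and 1947 is left out because one digit of its original figure is not legible in the schedule as reproduced here.

```python
# (original, adjusted) net income in millions of dollars, from the schedule.
# 1947 is omitted: its original figure is not fully legible.
SCHEDULE = {
    1940: (8_972, 9_013),
    1941: (16_038, 16_013),
    1942: (22_981, 22_839),
    1943: (27_678, 27_693),
    1944: (26_183, 26_197),
    1945: (21_051, 21_093),
    1946: (24_533, 24_268),
    1948: (34_027, 34_022),
    1949: (27_911, 28_178),
}


def discrepancy(year):
    """Return (amount, per cent of original) of adjusted minus original."""
    original, adjusted = SCHEDULE[year]
    amount = adjusted - original
    return amount, 100.0 * amount / original


for year in sorted(SCHEDULE):
    amount, per_cent = discrepancy(year)
    print(f"{year}  {amount:+5d}  {per_cent:+.2f}%")
```

On the figures as transcribed, the 1949 discrepancy is 267 (about 0.96 per cent) and the 1946 discrepancy is -265 (about 1.08 per cent); every other year shown is under 0.7 per cent in absolute value.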

In other words, in none of the last ten years is the change in net income resulting from 
the adjustment of any substantial significance. When we remember the very shaky basic 
assumption on which the adjustment rests and the delay in making the adjustment for 
any one year until the tabulation of the following year is complete, I think we are forced 
to conclude that the adjustment is not worthwhile. The tabulated figures probably do not 
give a dependable picture of income originating within specific calendar years, but the 
above adjustment does not necessarily yield even a minor improvement of the tabulated 
figures. Whether, if the necessary facts were available, some other method of adjustment
could be developed which would be a substantial improvement is a question which cannot 
usefully be examined without raising many preliminary questions about such facts. 

The above record was not extended back to 1928, on the chance that the very wide 
cyclical movements of that time might have led to more significant adjustments, because, 
as shown in Part I, fiscal-year reporting was much less common in those earlier years than 
in 1940-1949. Hence, even with wide cyclical fluctuations in profits, the adjustments of 
fiscal-year figures were unlikely to be substantial in comparison with the very large calendar-year
figure, which is not subject to any adjustment. Moreover, the present appendix
is aimed solely at establishing that the net-income figures as now tabulated, however de- 
fective they may be, cannot be significantly improved by any currently feasible adjust- 
ment. 

The schedule of original and adjusted figures shows one remarkable result: The adjusted 
figures show smaller net changes than the original figures in the advances of 1940-1943 
and 1945-1948 and in the declines of 1943-1945 and 1948-1949. This is contrary to ex- 
pectation. The original figure for a year such as the 1945 low is presumably somewhat too 
high because it includes various fiscal years reaching back into 1944 and forward into 1946 
—both years of higher profits than 1945. The adjusted figure cuts down the weight of these 
fiscal-year constituents of the total, and might therefore be expected to show a sharper 
dip in 1945 than the original figure shows. The actual showing is contrary to this. A pos- 
sible explanation is the wide diversity in the practice of fiscal-year reporting among lines 
of industry and sizes of enterprise. The profit decline to 1945, and the following recovery, 
could have had differential effects, as to timing and intensity, upon these classes of
corporations, with the net result that our adjustment did not happen to produce the expected
outcome. 
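
The statement about net changes can be verified from the schedule itself. The sketch below compares the size of each advance and decline in the original and adjusted series; the dictionary and function names are illustrative only, and only the turning-point years named in the text are needed.

```python
# Net income in millions of dollars at the turning points named in the text.
original = {1940: 8972, 1943: 27678, 1945: 21051, 1948: 34027, 1949: 27911}
adjusted = {1940: 9013, 1943: 27693, 1945: 21093, 1948: 34022, 1949: 28178}

# Advances of 1940-1943 and 1945-1948, declines of 1943-1945 and 1948-1949.
swings = [(1940, 1943), (1943, 1945), (1945, 1948), (1948, 1949)]


def net_change(series, start, end):
    """Size of the movement between two years, regardless of direction."""
    return abs(series[end] - series[start])


for start, end in swings:
    o = net_change(original, start, end)
    a = net_change(adjusted, start, end)
    print(f"{start}-{end}: original {o:,}, adjusted {a:,}")
```

In all four movements the adjusted change is the smaller of the two, which is the result the text describes as contrary to expectation.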


APPENDIX C 


PART-YEAR ACCOUNTING PERIODS 
AND THEIR AVERAGE CENTER 


In the tabulation of Statistics of Income figures for any taxable year, part-year returns 
are included in most tables along with calendar-year and fiscal-year returns. The possible 
part-year accounting periods are extremely varied both as to length and as to terminal 
date. They may be separated into three broad groups: 


1. Those falling entirely within the specified calendar year 

2. Those with a length covering an odd number of months, and falling partly in the 
specified calendar year and partly in the preceding or following calendar year 

3. Those with a length covering an even number of months, and falling partly in the 
specified calendar year and partly in an adjacent calendar year 


All returns of group 1 are included in tabulations for the specified year. Those returns 
in groups 2 or 3 which have the majority of their months within the specified calendar year 
are included. Those returns of group 3 which have an equal number of months in the 
specified calendar year and in the following year are included.*

Those periods in group 1 covering an odd number of months have their centers at the 
fifteenth of the central month of the period; those periods covering an even number of 
months have their centers at the first of a month chosen so that the period is equally 
divided. Examination of the whole list shows the centers of the various possible periods of 
group 1 as follows: 





* The first two statements are in accord with the general rule regularly published in S. of I. as to the assignment
of part-year returns. The third statement is in accord with a letter from an official of the Bureau of Internal
Revenue’s Statistical Division.
               Number of                     Number of
Center          Periods      Center           Periods
January 15         1         July 15             6
February 1         1         August 1            5
         15        2                15           5
March 1            2         September 1         4
      15           3                   15        4
April 1            3         October 1           3
      15           4                 15          3
May 1              4         November 1          2
    15             5                  15         2
June 1             5         December 1          1
     15            6                  15         1
July 1             5         Total              77


If we assume that a return in group 1 is as likely to have any one of these seventy-seven 
periods as any other, we can regard the most likely distribution of periods of group 1 
among these seventy-seven centers, for any large number of part-year returns of this 
group as proportional to the above numbers. Those numbers are exactly symmetrical 
about July 1; hence, the average center of periods so distributed is July 1. This implies 
that the most probable average center of part-year periods in group 1 is July 1, but one 
weakness of the reasoning is that we may not have a sufficiently large number of part- 
year returns in this group to render the indicated distribution highly probable. 
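
The count of seventy-seven periods and their symmetry about July 1 can be reproduced by enumeration. The sketch below lists every part-year period of one to eleven whole months lying inside a single calendar year and records its center in half-months after January 1 (so January 15 is 1 and July 1 is 12). The function name and the half-month encoding are assumptions made for illustration.

```python
from collections import Counter


def group1_centers():
    """Centers, in half-months after January 1, of all part-year periods
    of 1 to 11 whole months lying wholly inside one calendar year."""
    centers = []
    for length in range(1, 12):
        for start in range(0, 12 - length + 1):  # start month, 0 = January
            centers.append(2 * start + length)
    return centers


centers = group1_centers()
counts = Counter(centers)
mean_center = sum(centers) / len(centers) / 2  # in months after January 1

print(len(centers))   # 77
print(mean_center)    # 6.0, that is, July 1
```

Under the equal-likelihood assumption stated in the text, the enumeration gives three periods centered at March 15 and three at April 1, matching their mirror images at October 15 and October 1.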

A similar analysis of the thirty periods in group 2 leads to a corresponding conclusion 
for that group: Their most probable average center is at July 1, but the probability is not 
very high. And the same may be said of the twenty periods of group 3 for which more than 
half the months fall within the specified calendar year. 

For the remaining periods of group 3—periods covering an even number of months of 
which precisely half fall within the specified calendar year—no such conclusion holds. By 
the rule of assignment of such returns to the tabulating year, only those which overlap the 
end of the year are included. Those overlapping the beginning of the year are tabulated 
with the preceding year. The included periods (using 1949 as the taxable year) are: 


August 1949-May 1950 
September 1949—April 1950 
October 1949-March 1950 
November 1949—February 1950 
December 1949-January 1950 


All these periods center on January 1, 1950. This subgroup within group 3 therefore 
throws the probable average center of all part-year returns, including those of all three
groups, somewhat later than July 1, 1949. But this subgroup includes only five periods,
whereas the other subgroup of group 3 and groups 1 and 2 include a very large number of 
periods (127), and probably have average centers at July 1. This deviation from July 1 
is therefore very slight. 
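
The size of this deviation can be gauged by enumerating all included periods. The sketch below applies the assignment rule described above to every period of one to eleven months and averages the centers. Treating all 132 included periods as equally likely is an assumption added here for illustration (the text makes it only within each group), and the function and variable names are hypothetical.

```python
from collections import defaultdict


def classify(start, length):
    """Assign a part-year period to its group under the tabulation rule.

    start  -- first month of the period, counted from January of the
              tabulation year (0 = January; negative = preceding year)
    length -- number of months in the period (1 to 11)
    Returns None when the period is tabulated with an adjacent year.
    """
    inside = max(0, min(start + length, 12) - max(start, 0))
    if inside == length:
        return "group 1"
    if 2 * inside > length:
        return "group 2" if length % 2 else "group 3, majority"
    if 2 * inside == length and start + length > 12:
        return "group 3, equal split"
    return None


centers = defaultdict(list)
for length in range(1, 12):
    for start in range(-11, 12):
        group = classify(start, length)
        if group is not None:
            centers[group].append(start + length / 2)  # months after January 1

for group, values in sorted(centers.items()):
    print(group, len(values), sum(values) / len(values))

all_centers = [c for values in centers.values() for c in values]
overall = sum(all_centers) / len(all_centers)
print(round(overall - 6, 3))  # months after July 1: about 0.227
```

The enumeration reproduces the counts of 77, 30, 20 and 5 periods. Under the added assumption, the overall average center falls roughly a week after July 1.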

The assumption is therefore made at various points in the text that the average center 
of all part-year returns is at July 1. The validity of this assumption is probably not very 
high, and it is less likely to be high in terms of some accounting item, such as net income, 
than in terms of number of returns. 
APPENDIX D 


ESTIMATE OF FISCAL-YEAR 
TOTAL ASSETS FOR 1934 


As already indicated in Section 18, the 1934 tabulation of fiscal-year balance-sheet re- 
turns gives no accounting items but merely the number of returns. In connection with 
Table 3 of Section 4, however, an estimate of the total assets for all fiscal-year balance-sheet
returns was needed. This section explains that estimate. Table D-1 shows details of
the estimate for the net-income category of the Manufacturing division. From Statistics
of Income tabulations cited in Section 19, the figures of the first three columns of the table
are obtainable. Division of the third column by the first yields average total assets for all 
returns, regardless of accounting period. To obtain the estimate of total assets for the 
fiscal-year returns of each size class, average total assets is multiplied by the second col- 
umn, and the result is shown in the final column. 

The All figure in the final column is not estimated directly from the All figures in the
first three columns, but is the sum of the size-class estimates. The reason is that the average total
assets involved in such an estimate is probably not valid for the fiscal-year returns: the
shape of the size distribution, as noted in Section 17, is such that the average total assets
may not be highly typical. The situation is different for the specific size classes, except
possibly the open-end class with lower limit of $50 million. For each such size class, because
of the limited boundaries, the average is fairly typical, and that average, as computed
for all returns, can ordinarily be accepted as approximately valid for the fiscal-year
returns. The major exceptions occur when the number of returns in a size class is
very small, but even here the class average of total assets is probably more typical than
the over-all average for all size classes combined.


TABLE D-1 


ESTIMATE OF AGGREGATE TOTAL ASSETS IN EACH ASSETS SIZE CLASS 
AND FOR ALL SIZE CLASSES COMBINED, FOR FISCAL-YEAR 
RETURNS IN THE NET CATEGORY OF THE MANU- 
FACTURING DIVISION, 1934 


(dollars in thousands) 





                   Number of Returns             Total Assets

Lower Limit                     Fiscal-         All,      Fiscal-Year,
of Size Class         All        Year         Actual       Estimated

       0           12,097       1,997        275,286          45,445
      50            5,615       1,055        405,219          76,136
     100            6,348       1,154      1,020,093         185,442
     250            3,500         760      1,243,998         270,125
     500            2,384         546      1,667,157         381,824
   1,000            2,401         513      5,073,436       1,083,995
   5,000              356          81      2,495,612         567,822
  10,000              316          59      6,307,941       1,177,748
  50,000               73          13     10,446,973       1,860,420
     All           33,090       6,178     28,935,715       5,648,957
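
The construction of the final column can be sketched in a few lines of Python. Each estimate is the class average of total assets (third column divided by first) multiplied by the number of fiscal-year returns in the class. The names `rows` and `estimate_fiscal_assets` are illustrative, and rounding to the nearest thousand dollars is an assumption about how the published figures were obtained.

```python
# (lower limit of size class, all returns, fiscal-year returns,
#  total assets of all returns in thousands of dollars), from Table D-1.
rows = [
    (0, 12097, 1997, 275286),
    (50, 5615, 1055, 405219),
    (100, 6348, 1154, 1020093),
    (250, 3500, 760, 1243998),
    (500, 2384, 546, 1667157),
    (1000, 2401, 513, 5073436),
    (5000, 356, 81, 2495612),
    (10000, 316, 59, 6307941),
    (50000, 73, 13, 10446973),
]


def estimate_fiscal_assets(n_all, n_fiscal, assets_all):
    """Class average of total assets times the number of fiscal-year returns."""
    return round(assets_all / n_all * n_fiscal)


estimates = [estimate_fiscal_assets(a, f, t) for _, a, f, t in rows]
total_by_class = sum(estimates)

# For comparison: the direct estimate from the All row, which the text rejects.
n_all = sum(a for _, a, _, _ in rows)
n_fiscal = sum(f for _, _, f, _ in rows)
assets_all = sum(t for _, _, _, t in rows)
direct = estimate_fiscal_assets(n_all, n_fiscal, assets_all)

print(f"sum of class estimates: {total_by_class:,}")
print(f"direct estimate:        {direct:,}")
```

The sum of the class estimates reproduces the tabulated 5,648,957. The direct estimate from the All row comes to about $5.4 billion, noticeably lower, which illustrates why the over-all average is not used.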


The estimate must be carried out separately for the net and no-net categories because 
ordinarily, since the size distribution of no-net returns is more steeply J-shaped than that 
of net returns, the class averages for the no-net category run somewhat lower than those 
for the net category. The estimates from the no-net category size-class figures are shown in 
Table D-2, and yield just under $3 billion for all size classes combined. If we combine the
All figures for the net and no-net categories, we have $8,644,659,000 as the estimated total 
assets on all fiscal-year returns in Manufacturing in 1934. 

The results of similar estimates for each division and group and all divisions combined 
appear in Table D-3. One might suggest that a more precise estimate for Manufacturing 
than the $8,645 million could be obtained by combining the estimates of the thirteen 
manufacturing groups; this would yield $8,817 million. Likewise, instead of the direct 
estimate of $26,770 million for all divisions combined, one might add the specific estimates 
for the nine broad divisions, with or without the above adjustment in Manufacturing. 
The result would then be $25,602 million with such adjustment, or $25,430 million without. 
After consideration of various matters affecting these alternative estimates, I decided to 
use the direct estimate of $26,770 million for Table 3 of Section 4. 
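
The alternative Manufacturing figure and the size of the adjustment can be checked against the group estimates in Table D-3. The sketch below sums the thirteen group figures and compares the result with the direct estimate; the variable names are illustrative only.

```python
# Estimated total assets on fiscal-year returns, thousands of dollars,
# for the thirteen manufacturing groups of Table D-3.
groups = {
    "Food": 1805414,
    "Beverages": 185449,
    "Tobacco": 12523,
    "Textiles": 1262612,
    "Leather": 341183,
    "Rubber": 243117,
    "Forest products": 256251,
    "Paper": 431291,
    "Printing": 212248,
    "Chemicals": 1706833,
    "Stone": 132218,
    "Metal": 1909720,
    "Other manufacturing": 317672,
}

manufacturing_direct = 8644659           # direct estimate for Manufacturing
groups_total = sum(groups.values())      # estimate built up from the groups
adjustment = groups_total - manufacturing_direct

print(f"sum of groups: {groups_total:,}")
print(f"adjustment:    {adjustment:,}")
```

The group sum rounds to $8,817 million, and the adjustment of about $172 million equals the gap between the $25,602 million and $25,430 million alternatives quoted above.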


TABLE D-2 


ESTIMATE OF AGGREGATE TOTAL ASSETS IN EACH ASSETS SIZE CLASS 
AND FOR ALL SIZE CLASSES COMBINED, FOR FISCAL-YEAR RETURNS 
IN THE NO-NET CATEGORY OF THE MANUFACTURING DIVISION, 1934 


(dollars in thousands) 


                   Number of Returns             Total Assets

Lower Limit                     Fiscal-         All,      Fiscal-Year,
of Size Class         All        Year         Actual       Estimated

       0           31,454       5,032        536,710          85,863
      50            6,837       1,176        486,073          83,607
     100            6,675       1,147      1,057,978         181,798
     250            3,163         614      1,108,967         215,272
     500            1,951   [missing]      1,367,034         274,668
   1,000            1,808   [missing]      3,729,624         748,813
   5,000              254   [missing]      1,761,284         284,302
  10,000              211   [missing]      4,376,524         788,189
  50,000               55   [missing]      9,162,730         333,190
     All        [missing]   [missing]     23,586,924       2,995,702
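
Several fiscal-year counts in Table D-2 are missing from this copy. Because each estimate is the class average of total assets multiplied by the fiscal-year count, the count implied by each row can be recovered by inverting that relation. The sketch below does so for the five affected rows; the results are an inference from the method of Table D-1 and are not figures taken from the source, and the function name is hypothetical.

```python
# Rows of Table D-2 whose fiscal-year count is missing in this copy:
# (all returns, total assets of all returns, estimated fiscal-year assets),
# with assets in thousands of dollars.
rows = [
    (1951, 1367034, 274668),
    (1808, 3729624, 748813),
    (254, 1761284, 284302),
    (211, 4376524, 788189),
    (55, 9162730, 333190),
]


def implied_fiscal_count(n_all, assets_all, estimate):
    """Invert estimate = (assets_all / n_all) * n_fiscal for n_fiscal."""
    return round(estimate * n_all / assets_all)


implied = [implied_fiscal_count(*row) for row in rows]
print(implied)
```

Under that assumption the five counts come out at 392, 363, 41, 38 and 2.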
TABLE D-3 


ESTIMATED AGGREGATE TOTAL ASSETS REPORTED ON FISCAL-YEAR 
RETURNS IN 1934, BY INDUSTRIAL DIVISIONS AND GROUPS 


(thousands of dollars) 


All divisions combined 
Agriculture 
Mining and quarrying 


Manufacturing                 8,644,659
  Food                        1,805,414
  Beverages                     185,449
  Tobacco                        12,523
  Textiles                    1,262,612
  Leather                       341,183
  Rubber                        243,117
  Forest products               256,251
  Paper                         431,291
  Printing                      212,248
  Chemicals                   1,706,833
  Stone                         132,218
  Metal                       1,909,720
  Other manufacturing           317,672


Construction                    194,443
Public utilities              2,205,911
Trade                         4,582,530
Service                       1,521,215
Finance                       7,482,647
Not allocable                     7,442
All communications concerning this section should be addressed to the Abstracts
Editor, Professor W. L. Smith, Department of Statistics, University of
North Carolina, Chapel Hill, North Carolina. 


Andrews, F. C. and Chernoff, H., “A large-sam- 
ple bioassay design with random doses and un- 
certain concentration,” Biometrika, 42 (1955), 
307-15. 

There are available a limited quantity of a 
suspension of bacteria of unknown concentra- 
tion, and a limited number of test animals. It is 
required to estimate the virulence of the bacteria, 
defined as the probability that a single organism 
will induce a response. Part of the experimental 
material must be used to estimate concentration 
and the remainder allocated to the test animals. 
The paper works out the appropriate division of 
effort; a preliminary estimate of the virulence is 
required but this need only be very approximate. 
D. R. Cox, University of North Carolina. 


Armsen, P., “Tables for significance tests of 
2X2 contingency tables,” Biometrika, 42 (1955), 
494-511. 

Tables are given for a two-sided form of Fish- 
er’s exact test for a 2X2 contingency table. The 
tables are preceded by a careful discussion of 
how to build up the two tails corresponding to a 
preassigned significance level. D. R. Cox, Uni- 
versity of North Carolina. 


Bailey, Norman T. J., “Some problems in the 
statistical analysis of epidemic data,” Journal 
of the Royal Statistical Society, Series B, 17 
(1955), 35-58. 

The author first reviews and summarizes 
previous work on deterministic and stochastic 
models of epidemics. His subsequent purpose is 
to explore further some of the problems con- 
nected with epidemics within small groups, 
“such as family or household groups where 
homogeneous mixing is a reasonable assump- 
tion.” For two chain binomial models in which 
the incubation period is constant and the infec- 
tious period a point, the author gives the fre- 
quencies of numbers of newly infected persons at 
the stages of the chain for families of five and less 
when the epidemic is started by the introduction 
of one infective or of several simultaneously. The 
effects of variable chances of infection between 
families and between individuals are investigated 
and the results for families of three are shown 
explicitly; a model with the probability of infec- 
tion varying among families according to a 
beta-distribution provides a satisfactory fit to 
some measles data. Estimates of the periods of 
incubation and infectiousness are derived in a 
model for households of two susceptibles when 
the disease has a normally distributed latent 
period, followed by a constant period of infectiousness.
E. S. Page, University of Durham,
England.


Banks, C., “The factorial analysis of crop pro- 
ductivity,” Journal of the Royal Statistical 
Society (B), 16 (1954), 100-11. 

The author presents an extensive “re-exam- 
ination of Professor Kendall’s data” as given by 
Kendall (1950) in a paper on “Factor analysis as 
a statistical technique.” Kendall used the method 
of principal components to find a general factor 
of productivity, indicating analogies and differ- 
ences between his approach and the factor an- 
alysis methods used by psychologists. Banks uses 
the same data but carries the analysis by the 
methods of psychologists a few stages further 
and then examines Kendall’s criticisms and suggestions.
The data is analyzed by three different
factor analysis methods: (i) weighted summation 
with variances unity; (ii) simple summation 
with variances unity; (iii) simple summation 
with reduced variances. Method (i) is shown to 
be algebraically equivalent to Kendall’s method 
for finding a single factor of general productivity. 
The other methods are suggested as alternatives 
that yield very close approximations to the re- 
sults obtained by the more rigorous procedure 
without entailing nearly so much labor. R. N. 
Pendergrass, Virginia Polytechnic Institute. 


Bennett, J. H., “The distribution of hetero- 
geneity upon inbreeding,” Journal of the Royal 
Statistical Society (B), 16 (1954), 88-99. 

“In this paper, it is shown how the variance 
of the map length of the germ plasm, heterogene- 
ous after a given amount of inbreeding, can be 
calculated. Matrix methods are used in the eval- 
uation of the variance for several systems of in- 
breeding after a large number of generations. 
The frequency distribution of the map length 
heterogeneous after many generations is devel- 
oped in terms of the modified Bessel function of 
the first kind, the probability of complete homo- 
geneity being represented by a condensation at 
the origin. A comparison of these distributions 
for sib and parent-offspring matings, two in- 
breeding systems which in general give the same 
speed of approach to homogeneity at a single 
genetic locus, reveals interesting differences in 
the probability statements that can be made in
these two cases.” (From author’s summary).
Clyde Y. Kramer, Virginia Polytechnic Institute.


Bracewell, R. N., “Correcting for running means 
by successive substitutions,” Australian Journal 
of Physics, 8 (1955), 329-34. 

It often happens that we are interested in a 
function f(z) and have access to only some 
weighted integral of f(z). Of special importance 
is the case of the running mean, where we observe
the average over an interval symmetri[illegible]
[illegible]mate f(z) by constructing the running mean of
the observed function and take the difference
between the result thereof and the observed
function itself as a correction to the observations.
The process may be repeated and leads to
a sequence of successive estimates. These converge
to the original function f(z) under very restrictive
conditions on the function. Anyhow,
the first few estimates may approach the correct
function despite the fact that the procedure
finally becomes divergent. This is illustrated by
a numerical example, where f(z) is a bell-shaped
curve and the average is taken over a length
which is about [illegible] of the width of the bell.
Lars-Henning Zetterberg, University of Chicago.
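
The correction procedure summarized in this abstract can be illustrated with a short Python sketch. It assumes that each repetition adds to the current estimate the difference between the observed series and the running mean of that estimate, which agrees with the first step as described but is otherwise an assumption about how the repetition is carried out. The bell-shaped test curve, the grid and the window width are hypothetical choices made only for illustration.

```python
import math


def running_mean(values, half_width):
    """Average of each point and its neighbours within half_width steps."""
    n = len(values)
    means = []
    for i in range(n):
        window = values[max(0, i - half_width): min(n, i + half_width + 1)]
        means.append(sum(window) / len(window))
    return means


def successive_estimates(observed, half_width, steps):
    """Return the observed series followed by `steps` corrected estimates."""
    estimates = [list(observed)]
    for _ in range(steps):
        current = estimates[-1]
        smoothed = running_mean(current, half_width)
        # Correction: observed series minus the running mean of the estimate.
        estimates.append(
            [c + (g - s) for c, g, s in zip(current, observed, smoothed)]
        )
    return estimates


# Hypothetical test case: a bell-shaped curve observed through a running mean.
grid = [i / 10 for i in range(-60, 61)]
true_curve = [math.exp(-x * x / 2) for x in grid]
observed = running_mean(true_curve, half_width=3)

for step, estimate in enumerate(successive_estimates(observed, 3, 3)):
    worst = max(abs(e - f) for e, f in zip(estimate, true_curve))
    print(step, worst)
```

For a smooth curve and a narrow window the first correction brings the estimate closer to the true curve than the raw observations are. The sketch says nothing about the eventual divergence mentioned in the abstract.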


Bradley, R. A., “Rank analysis of incomplete 
block designs; III Some large-sample results on 
estimation and power for a method of paired 
comparisons,” Biometrika, 42 (1955), 450-70.
This paper contains a comprehensive investigation
[illegible] method of dealing with paired comparison data
by Bradley and Terry, Biometrika, 39: 
324 (1952). D. R. Cox, University of North 
Carolina. 


Box, G. E. P. and Andersen, S. L., “Permutation 
theory in the derivation of robust criteria and 
the study of departures from assumption,” 
Journal of the Royal Statistical Society, Series B, 
17 (1955) 1-26. 

Tests on means using the analysis of variance 
are affected only little by nonnormality of the 
parent population (the tests are “robust”) but 
tests on variances which assume normality (e.g. 
F, Bartlett’s M) can be “so misleading as to be 
almost valueless” if the parent kurtosis differs appreciably
from normal kurtosis. The authors seek
tests which are insensitive to departures from 
assumptions by convenient approximations to 
tests based on permutation theory. For compar- 
ing two variances, they show that the usual F 
criterion with degrees of freedom modified by a 
factor depending on the sample kurtosis is much 
less affected by nonnormality than the unmodi- 
fied test. When the distribution is normal, the 
power of the modified test is slightly less than 
that of the usual test. A similar modification is 
given for Bartlett’s M. The paper is followed by 
a discussion containing a steepest descent 
method of approximation to permutation distributions
(H. E. Daniels), and an indication
that tests of dispersion based on the range are
more robust than those using r.m.s. (D. R. Cox).
E. S. Page, University of Durham, England.


Chapman, D. G., “Population estimation based 
on change of composition caused by a selective 
removal,” Biometrika, 42 (1955), 279-90. 
Suppose that it is required to estimate the 
number of individuals in a population. Let the 
individuals be divided into two or more groups, 
for example males and females, and let the pro- 
portion of “males” be estimated from a random 
sample. Then let known numbers of individuals 
be removed from the population, the proportions 
removed being different for the two “sexes.” 


Cvetkov, B., “A new method of computation in
the theory of least squares,” Australian Journal
of Applied Science, 6 (1955), 274-80.

[illegible] problems, a common
and useful procedure is the successive orthogonalization
of the “independent variables.”
[illegible]
with the addition of further side conditions but
not with additional data, as asserted on p. 276.
The procedure is applied to several geodetic
examples. David L. Wallace, University of
Chicago.


David, H. A., “A note on moving ranges,” Biometrika,
42 (1955), 612-15.

[illegible] a sample [illegible], ordered in time,
[illegible] over all i, the result is called the mean moving
range of samples of n. Similarly for other statis[illegible]
[illegible] of North Carolina.


Fisher, Sir Ronald, “Statistical methods and
scientific induction,” Journal of the Royal
Statistical Society, Series B, 17 (1955), 69-78.
The author discusses the differences between
acceptance procedures and significance tests and
[illegible] of certain ideas useful in the former. The
[illegible] popula[illegible] “inductive behaviour.” Examples
[illegible] regression coefficient and on a 2X2 table are
shown to “violate the rule of determining
[illegible] requirements of inductive inferences. E. S. Page,
University of Durham, England.
Foster, F. G. and Stuart, A., “Distribution-free 
tests in time-series based on the breaking of 
records,” Journal of the Royal Statistical Society 
(B), 16 (1954), 1-22. 

Two simple statistics based on records are 
proposed as tests of the hypothesis that n obser- 
vations have been independently drawn from the 
same continuous distribution. These statistics 
are the sum (s) and difference (d) of the numbers 
of upper and lower records in the series. They 
are uncorrelated and asymptotically normally 
distributed. One statistic provides a consistent 
test against trend in the mean and the other a 
consistent test against trend in the variance. 
The former test was found to be considerably 
more powerful than three well-known tests. It is 
not as powerful as the rank-correlation tests but 
has the advantage of being simpler to compute. 
Illustrative applications are made to data from 
meteorology and athletics. Leo Lyncu, Virginia 
Polytechnic Institute. 
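
The two statistics described in this abstract are simple to compute, as the following Python sketch suggests. An upper record is taken to be an observation larger than all earlier ones, and a lower record one smaller than all earlier ones; not counting the first observation as a record is a convention assumed here, and the function name is illustrative.

```python
def record_statistics(series):
    """Return (s, d): the sum and difference of the numbers of upper and
    lower records in the series, not counting the first observation."""
    upper = lower = 0
    highest = lowest = series[0]
    for value in series[1:]:
        if value > highest:
            upper += 1
            highest = value
        elif value < lowest:
            lower += 1
            lowest = value
    return upper + lower, upper - lower


print(record_statistics([1, 2, 3, 4, 5]))      # steadily rising series
print(record_statistics([3, 1, 4, 1, 5, 0]))   # widening swings
```

A steadily rising series produces only upper records, so d is large, while a series with widening swings produces records of both kinds, so s is large while d stays near zero.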


Foster, F. G. and Teichroew, D., “A sampling 
experiment on the powers of the records tests 
for trend in a time series,” Journal of the Royal 
Statistical Society (B), 17 (1955), 115-21. 

Two statistics based on the breaking of records 
are described in a paper by Foster and Stuart 
(1954), “Distribution free tests in time-series 
based on the breaking of records.” These two 
statistics provide distribution free tests of the 
randomness of a series of observations. “Power 
functions at the 5 per cent level are given against 
the alternative that the samples arise from a 
normal distribution with a positive linear trend in 
the mean and a constant variance.” These 
power functions resulted from a sampling experi- 
ment carried out by the authors on a National 
Bureau of Standards Western Automatic Com- 
puter in the Institute for Numerical Analysis, 
Los Angeles. 

The present paper by Foster and Teichroew 
describes this sampling experiment which had 
60,000 samples, excluding recomputations, and 
contained nearly four million normal deviates. 
The entire results of the experiment are contained
in three tables. Harold A. Still, Virginia
Polytechnic Institute.


Gurland, John, “On regularity conditions for 
maximum likelihood estimators,” Skandinavisk 
Aktuarietidskrift, 1954, pp. 71-6. 

The usual sufficient conditions for a maximum 
likelihood estimate to be consistent, asymptoti- 
cally normal and asymptotically efficient are 
simplified. In particular, the need for the exist- 
ence of third order derivatives is eliminated. 
Joseph A. Dusay, University of Chicago. 


Hammersley, J. M. and Morton, K. W., “Poor 
man’s Monte Carlo,” Journal of the Royal Sta- 
tistical Society (B), 16 (1954), 23-38. 

The authors attempt to destroy the misconception
that Monte Carlo methods demand elaborate,
expensive electronic equipment. Three
examples are presented illustrating how Monte 
Carlo methods can be employed by the poor man 
using only pen and paper. The first example, 
from nuclear physics, illustrates how to deter- 
mine the critical size of the chain-reacting system 
of a nuclear reactor. A free neutron, put into the 
core, performs a random walk determined by 
three probability distributions depending on the 
energy of the neutron. The number of neutrons 
arising from collisions and fissions is tallied for 
various core sizes and the critical size of the 
chain-reacting system is then estimated by inter- 
polation. The second example, taken from the 
field of archaeology, is a study of the diameters of 
Druid circles in Western Scotland. The last 
example is a chemical problem dealing with some 
problems of self-avoiding walks. If we denote a 
walk by $W_n$ and let $R_n$ denote a self-avoiding
$W_n$, then $R_n$ is called a Meyer molecule. The
problem attacked by the authors is to determine
K(n) for large n where $e^{[illegible]}$ is defined to be
the total number of distinct $R_n$. The solution appears
to be beyond analytical resources, but can
be handled by Monte Carlo methods. 

A discussion is presented on how the Monte 
Carlo work can be reduced by choice of sampling 
technique, choice of recording technique, and by 
inversely restricted sampling. The authors illus- 
trate some principles of Monte Carlo computa- 
tion by comparing a poor computing layout with 
a better layout. R. E. Walpole, Virginia Polytechnic
Institute.


Hammersley, J. M., and Morton, K. W., 
“Transposed branching processes,” Journal of 
the Royal Statistical Society (B), 16 (1954), 76-9 


“We shall show in this paper how, in studying 
the growth-rate of a deterministic branching 
process a simple change of viewpoint, which is 
quite trivial theoretically, may nevertheless lead 
to surprising computational simplifications and, 
in the course of this, we shall indicate a basis on 
which the generated population can sometimes be 
reclassified into fewer classes.” [From authors' summary.]
Richard A. Stewart, Virginia
Polytechnic Institute. 


Hannan, E. J., “An exact test for correlation between
time series,” Biometrika, 42 (1955), 316-26.

The asymptotic relative efficiencies of various 
tests for correlation between two time series are 
considered. A test proposed by Quenouille, 
Journal of the Royal Statistical Society, B, 11: 
68 (1949) comes out best. D. R. Cox, University 
of North Carolina. 


Hoel, Paul G., “On a property of the sequential 
t-test,” Skandinavisk Aktuarietidskrift, 1954, 
19-22. 

In his book on sequential analysis, Wald suggests
a sequential t-test and shows that, to within
good approximation, this test possesses a desirable
minimax property. Hoel shows by a counterexample
that this property does not hold exactly.
Herbert T. David, University of Chicago.


Huitson, A., “A method of assigning confidence 
limits to linear combinations of variances,” 
Biometrika, 42 (1955), 471-79. 

A problem that often arises in work with components
of variance is to obtain from independent
estimates $s_1^2, \ldots, s_k^2$ of variances
$\sigma_1^2, \ldots, \sigma_k^2$ a confidence interval for a linear combination
of the $\sigma_i^2$'s. A series expansion determining such
a confidence interval is worked out, based on 
normal distribution theory, and tables and a 
numerical example are given. D. R. Cox, Uni- 
versity of North Carolina. 


Huzurbazar, V. S., “Confidence intervals for the 
parameter of a distribution admitting a sufficient 
statistic when the range depends on the param- 
eter,” Journal of the Royal Statistical Society 
(B), 17 (1955), 86-90. 

A variate x has a probability density
$f(x, \theta) = g(x)/h(\theta)$, with $g(x) \ge 0$ and $a(\theta) \le x \le b(\theta)$,
where $a(\theta)$ and $b(\theta)$ change monotonically in
opposite directions, or one is constant, when the
parameter $\theta$ varies. The maximum likelihood
estimator $\hat{\theta}$ of $\theta$ from n independent observations
is a sufficient statistic. The present paper shows
in effect that $\Pr[h(\hat{\theta}) \le h(T)] = h(T)^n / h(\theta)^n$ when
T has any value for which $0 \le h(T) \le h(\theta)$. This
result is used to obtain confidence limits for $\theta$ in
three examples. M. C. K. Tweedie, Virginia
Polytechnic Institute.
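
A small simulation can illustrate a result of this kind in the simplest case. The sketch below takes the uniform distribution on (0, θ), for which the normalizing function is h(θ) = θ and the maximum likelihood estimator is the largest observation, so that the probability of the estimator not exceeding T is (T/θ) raised to the power n. The choice of distribution, the parameter values and the function name are assumptions made for illustration and are not taken from the paper.

```python
import random


def simulate(theta, n, t, trials, seed=0):
    """Fraction of samples of size n from Uniform(0, theta) whose
    maximum (the maximum likelihood estimate of theta) is at most t."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        mle = max(rng.uniform(0, theta) for _ in range(n))
        if mle <= t:
            hits += 1
    return hits / trials


theta, n, t = 2.0, 5, 1.6
empirical = simulate(theta, n, t, trials=20000)
exact = (t / theta) ** n  # h(T)^n / h(theta)^n with h(theta) = theta

print(f"simulated {empirical:.3f}, formula {exact:.3f}")
```

With these hypothetical values the formula gives about 0.328, and the simulated fraction should agree to within sampling error.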


Jowett, G. H., “Least squares regression an- 
alysis for trend-reduced time-series,” Journal of 
the Royal Statistical Society, Series B, 17 (1955), 
91-104. 


Scott, J. F., and Small, V. J., “A numerical investigation
of least squares regression involving
trend-reduced Markoff series,” ibid., 17 (1955), 
105-14. 

Let x(t), y(t) be time series, supposed to be
structurally related by $y(t) = \alpha + \beta x(t) + \epsilon(t)$,
where $\alpha$, $\beta$ are constants and $\epsilon(t)$ varies independently
of x(t). In the first paper the difficulties
of obtaining a valid estimate of the standard
error $\sigma_b$ of the regression coefficient b are discussed.
The suggested method of estimating
$\sigma_b$ is as follows: smooth the series x(t), y(t) using
the same operator for each and subtract the
smoothed series from the originals to give the
two series of trend-reduced residuals X(t), Y(t).
Then

$$b = \frac{\sum X(t)\,Y(t)}{\sum X^2(t)} \quad \text{and} \quad \sigma_b^2 = k\,\sigma_0^2,$$

where k is a constant depending on the smoothing
operator and the autocorrelations of x(t), $\epsilon(t)$,
and $\sigma_0^2$ is the variance of b if X(t), Y(t) were
series of independent terms. Next fit a trend-reduced
Markoff serial variation of autocorrelation
function to X(t), and to $Y(t) - bX(t)$, and
hence calculate k. The second paper investigates
two types of trend-reducing operator, and gives
tables by which a suitable operator may be selected;
further tables assist in the calculation of
k. E. S. Page, University of Durham, England.


Mauldon, J. G., “Pivotal quantities for Wishart’s 
and related distributions, and a paradox in 
fiducial theory,” Journal of the Royal Statistical 
Society (B), 17 (1955), 79—85. 

Through a sequence of simple matrix trans- 
formations, given a standard p-variate normal 
distribution, Mauldon derives two theorems, one 
for the population and one for a sample, on the 
independence of positive upper triangular ma- 
trices, K, defined by K’K=A, the dispersion 
matrix. From these theorems are deduced pivotal 
quantities (redefined elements of K) for A for a 
general p-variate normal, again for the population
and for a sample, when the second order
moments are taken about the population mean
and/or when the sample is small. The sampling 
distribution of the elements of this A is then 
brought to Wishart’s distribution. Confidence 
regions for K are indicated. A note is given re- 
garding a contradiction in the fiducial distribu- 
tion of the matrix of second moments about the 
population mean with alternate pivotal quanti- 
ties shown. The simplest case of this contradic- 
tion is worked out with numerical probabilities. 
R. H. Rirrensuran, Virginia Polytechnic Insti- 
tute. 


Moran, P. A. P., “Some experiments on the pre- 
diction of sunspot numbers,” Journal of the 
Royal Statistical Society (B), 16 (1954), 112-7.

Representation of the series of sunspot numbers
as a stationary [illegible] process is considered.
Such a process is generated by a relationship
of the form

$$x_t - m = a_1(x_{t-1} - m) + \cdots + a_k(x_{t-k} - m) + \epsilon_t$$

where $\epsilon_t$ is a series of random variates with zero
means and identical distributions, often taken
to be Gaussian. An analysis of the partial serial
correlation coefficients indicates that k may be
taken to have the value 2, so that an optimum
(in the least squares sense) linear predictor of the
form

$$x_t - m = a_1(x_{t-1} - m) + a_2(x_{t-2} - m)$$


may be used. The standard error of prediction is 
15.8 for one year ahead, increasing to 34.1 for 
four years ahead. These results are disappointing 
in view of the fact that the standard deviation 
of the whole series is only 36.27. 

Use of a non-linear predictor does not satisfactorily
decrease the standard error of prediction.
Certain empirical methods seem to be as
good or better, and it is concluded that sunspot
numbers are not adequately described by an
autoregressive scheme. Irwin Miller, Virginia
Polytechnic Institute.
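
The second-order predictor described in this abstract can be illustrated with a minimal sketch. It fits the two coefficients by ordinary least squares on deviations from the sample mean and then forecasts one step ahead. The function names, the simulated series, and the coefficient values 1.3 and -0.6 are illustrative assumptions; they are not Moran's estimates, which the abstract does not report.

```python
import random


def fit_ar2(x):
    """Least-squares fit of x_t - m = a1*(x_{t-1} - m) + a2*(x_{t-2} - m) + e_t.

    m is estimated by the sample mean; a1 and a2 solve the 2x2 normal equations.
    """
    m = sum(x) / len(x)
    d = [v - m for v in x]
    idx = range(2, len(d))
    s11 = sum(d[t - 1] ** 2 for t in idx)
    s22 = sum(d[t - 2] ** 2 for t in idx)
    s12 = sum(d[t - 1] * d[t - 2] for t in idx)
    s1y = sum(d[t - 1] * d[t] for t in idx)
    s2y = sum(d[t - 2] * d[t] for t in idx)
    det = s11 * s22 - s12 ** 2
    a1 = (s1y * s22 - s2y * s12) / det
    a2 = (s2y * s11 - s1y * s12) / det
    return m, a1, a2


def predict_next(x, m, a1, a2):
    """One-step-ahead linear prediction from the last two observations."""
    return m + a1 * (x[-1] - m) + a2 * (x[-2] - m)


def simulate_ar2(n, m, a1, a2, sigma, seed=0):
    """Generate n values of a Gaussian second-order autoregressive series (illustrative data only)."""
    rng = random.Random(seed)
    x = [m, m]
    for _ in range(n):
        x.append(m + a1 * (x[-1] - m) + a2 * (x[-2] - m) + rng.gauss(0.0, sigma))
    return x[2:]


if __name__ == "__main__":
    series = simulate_ar2(5000, 40.0, 1.3, -0.6, 1.0, seed=1)
    m, a1, a2 = fit_ar2(series)
    print(m, a1, a2, predict_next(series, m, a1, a2))
```

Forecasts further ahead can be obtained by feeding each prediction back in as if it were an observation, which is one reason the standard error of prediction grows with the horizon.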


Nordbotten, Svein, “On the determination of an 
optimum sample size,” Skandinavisk Aktuarie- 
tidskrift, 1954, 60-4. 

Suppose in estimating the mean of a finite 
population that the error and cost of measure- 
ment of each sampled element depends on a con- 
trollable parameter t, which might represent the 
time spent in the measuring process. The paper 
considers the optimum determination of t and
the sample size to give maximum accuracy subject
to a fixed total budget. An explicit solution
is obtained for the special case in which the individual
errors of measurement are uncorrelated,
with means inversely proportional to t, and with
a constant variance. The sample mean is taken
as estimator and its accuracy is measured by
mean square error. The cost is the sum of a fixed
overhead cost and an operating cost proportional
to the product t × sample size. Joseph A. Dusay,
University of Chicago. 
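
A rough sketch of the trade-off described in this abstract follows. It rests on simplifying assumptions that are not taken from the paper: the finite population correction is ignored, the bias of each measurement is written b/t, the combined per-element variance is a constant V, and the budget is overhead + c·t·n. Under those assumptions the mean square error of the sample mean is (b/t)² + V/n, and substituting n = (budget − overhead)/(c·t) and setting the derivative with respect to t to zero gives t³ = 2b²(budget − overhead)/(V·c). All names below are hypothetical.

```python
def mse(b, V, t, n):
    """Mean square error of the sample mean: squared bias plus variance (assumed model)."""
    return (b / t) ** 2 + V / n


def optimum_design(b, V, c, budget, overhead):
    """Return (t, n) minimising mse subject to overhead + c*t*n = budget.

    n is treated as continuous; in practice it would be rounded to an integer.
    """
    available = budget - overhead
    if available <= 0:
        raise ValueError("budget must exceed the fixed overhead")
    t = (2.0 * b * b * available / (V * c)) ** (1.0 / 3.0)
    n = available / (c * t)
    return t, n


if __name__ == "__main__":
    t, n = optimum_design(b=2.0, V=4.0, c=1.0, budget=1100.0, overhead=100.0)
    print(t, n, mse(2.0, 4.0, t, n))
```

The closed form is this sketch's own derivation under the stated assumptions and should not be read as the explicit solution given in the paper.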


Page, E. S., “A test for a change in a parameter 
occurring at an unknown point,” Biometrika, 
42 (1955), 523-7. 
Independent observations $x_1, \ldots, x_n$ are
available in the order in which they are obtained.
It is suspected that at some unknown point the
population changed, i.e. $x_1, \ldots, x_m$ is a
sample from one population and $x_{m+1}, \ldots, x_n$
a sample from another population, with m unknown.
A test of the significance of the change
is given when the observations are 0 or 1 (bi- 
nomial trial) and the initial probability of 0’s is 
known. D. R. Cox, University of North Carolina. 


Stuart, A., “A test for homogeneity of the mar- 
ginal distributions in a two-way classification,” 
Biometrika, 42 (1955), 412-6. 

Consider an n×m contingency table and let
it be required to give a significance test of the 
null hypothesis that the marginal distribution of 
probabilities among the m rows is the same as 
that among the n columns. A large-sample test 
is developed and illustrated with a numerical 
example. D. R. Cox, University of North Caro- 
lina. 


Tukey, J. W., “Interpolation and approximations
related to the normal range,” Biometrika,
42 (1955), 480-5. 

Semi-empirical formulas are given for the 
mean, variance, and percentage points of the 
range of random samples drawn from a normal 
population, and for the ratio of the sample range 
to the sample standard deviation. D. R. Cox, 
University of North Carolina. 


Watson, G. S., “The distribution of the ratio of 
two quadratic forms,” Australian Journal of 
Physics, 8 (1955), 402-7. 

Let A and B be real, symmetric and commuting
matrices of order 2n×2n, each of which has its
latent roots equal in pairs. The two quadratic
forms l = x'Ax and m = x'Bx in normal independent
variables may then each be expressed
as a linear combination of n independent Γ-variables
of order 1. A study is made of the ratio
l/m with B positive definite and with some restrictions
imposed on the latent roots of the
matrices. An expression of the exact distribution
of the ratio is derived starting from the joint
characteristic function of l and m. This function
is expressed as a weighted sum of characteristic
functions for l- and m-variables when these are
linear combinations of only two Γ-variables. The
inverse Fourier transform then readily gives the
density for the joint variables l and m. Further
integration leads to expressions for the density
and distribution functions of the ratio l/m. An
alternative form for the distribution function is
derived applying a method by Box. The former
formula is more convenient for the calculations
of the exact moments of l/m. Lars-Henning
Zetterberg, University of Chicago.


t 
ships,” Biometrika, 42 (1955), 360-81. 
Numerous significance tests connected with 
normal discriminant theory and normal linear 
functional relationships are set out in a very clear
problems 


avoiding the determination of latent roots and
latent vectors necessary for finding the 
“best” discriminant function or “best” linear 
relationship. D. R. Cox, University of North 
Carolina. 


Wurtele, Zivia S. “A rectifying inspection plan,” 
Journal of the Royal Statistical Society, Series B,
17 (1955), 124-7. 

A rectifying inspection scheme is required 
which permits $D_0$ or fewer defectives in a lot, and
which ensures that the probability of an accepted
lot containing more than $D_0$ defectives be at
most +. Such a scheme can be specified by an 
acceptance on a defecti 
sampled chart. For the limiting case where the 
sample size is infinite, the author calculates the 
boundary points for one such scheme. E. S.
Page, University of Durham, England.


Yates, F., “A note on the application of the
combination of probabilities test to a set of 2×2
tables,” Biometrika, 42 (1955), 404-11.

This is connected with the problem described 
in the immediately preceding abstract. A method 
of testing significance of the treatment difference,
but not of estimating its magnitude, is to calculate
χ² from the 2×2 contingency table formed
from each pair of units, and to combine in some
way the resulting quantities. This is useful in
quick analyses.

The recommended method is to calculate χ²
uncorrected for continuity for each 2×2 table
and to add the resulting signed values of χ.
Reasons are given for believing that the combination
test based on the probability integral
transformation of χ² is not very efficient. D. R.
Cox, University of North Carolina. 
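
The recommended combination can be sketched as follows. Each 2×2 table is entered as (a, b, c, d), with rows for the two treatments and columns for success and failure, oriented the same way in every table. The signed χ is the square root of the uncorrected χ² carrying the sign of ad − bc. Dividing the sum by the square root of the number of tables and referring it to the normal distribution is an assumption of this sketch; the abstract says only that the signed values are added. The function names are hypothetical.

```python
import math


def signed_chi(a, b, c, d):
    """Signed square root of the uncorrected chi-square for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        # A table with an empty margin carries no information about the difference.
        return 0.0
    return (a * d - b * c) * math.sqrt(n / denom)


def combine_signed_chi(tables):
    """Add the signed chi values and standardise the total (assumed normalisation)."""
    chis = [signed_chi(*table) for table in tables]
    z = sum(chis) / math.sqrt(len(chis))
    p_two_sided = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p_two_sided


if __name__ == "__main__":
    tables = [(8, 2, 5, 5), (7, 3, 4, 6), (9, 1, 6, 4)]
    print(combine_signed_chi(tables))
```

Because the signs are retained, differences in opposite directions cancel, which is what distinguishes this procedure from adding the χ² values themselves.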





Yates, F., “The use of transformations and maximum
likelihood in the analysis of quantal experiments
involving two treatments,” Biometrika,
42 (1955), 382-403. 

This paper deals with the analysis of a paired 
comparison experiment (for two treatments) in 
which the observation on each experimental unit 
i . r “successes” out of n trials 


whether the treatment effect is the same for all 
pairs and of whether the residual variation is 
binomial. 

Suitable mathematical specifications of the 
data are set up, allowing for between-pair variation,
and the parameters estimated by maximum
likelihood using the method of efficient scores. 
In certain cases, simple 
How to Lie with Statistics. Darrell Huff. New York: W. W. Norton & Co., Inc., 1954.
Pp. 142. $2.25. Paper. 


Jerome B. Cohen, College of the City of New York


If you don’t know how to statisticulate or about the sample with the built-in bias
or how to talk back to a statistic, consult Darrell Huff. His slender compendium 
of hucksterish fluff built around Disraeli’s thesis that “there are three kinds of lies: 
lies, damned lies, and statistics,” is good fun, amusingly written. 

In a frothy Madison Avenue style, Huff sets out to explore and deplore the misuse 
of statistics. In ten neat little chapters he examines the now standard techniques for 
misusing statistics—the sample that’s anything but random, index numbers to prove 
anything, the use of the kind of average that distorts most, jumping to unwarranted 
cause and effect conclusions based on high correlations, etc. Incidentally, Huff points
out that 187 years of publishing experience went into his book (add up ages of
author, editor, illustrator, printer and binder!). 

“Figures don’t lie, but liars figure” is an old saw, but as Huff applies it to advertising 
it comes alive again. “If you can’t prove what you want to prove,” Huff declares, 
“demonstrate something else and pretend they are the same thing. In the daze that 
follows the collision of statistics with the human mind, hardly anybody will notice 
the difference.” For example, “you cannot prove that your nostrum cures colds, but 
you can publish (in large type) a sworn laboratory report that half an ounce of the 
stuff killed 31,108 germs in a test tube in eleven seconds. While you are about it,
make sure that the laboratory is reputable or has an impressive name.” Reproduce 
the report in full, he suggests. Photograph a doctor-type model in white clothes and 
put his picture along side. Don’t mention the gimmicks in the story, of course. Don’t 
for heaven’s sake, explain that the antiseptic was used full strength in the test tube 
but is sold diluted 100-fold because full strength it would burn your throat tissues. 
Don’t say what kind of germs it killed because who knows what causes colds. Prob- 
ably it isn’t a germ at all. If you doubt that something so obvious would fool the 
public, remember the political candidate in Florida not long ago who made consider- 
able capital by accusing his opponent of “practicing celibacy,” or the New York 
exhibitor of the motion picture Quo Vadis who used huge type to quote the N. Y.
Times as calling it “historical pretentiousness,” or the makers of Crazy Water Crys- 
tals, a proprietary medicine, who advertised it as a “quick, ephemeral relief.” 

At first one is tempted to urge “make this required reading in your introductory 
statistics course.” It will be a good tonic, or antidote as the case may be, for students. 
But then a better use suggests itself. Buy the paper-bound edition (it doesn’t cost 
much) and cut it up; paste appropriate sections into your notes. When you talk about 
averages talk about the misuse of averages; when you reach correlation discuss spuri- 
ous correlations as well as valid ones. Use Huff’s puckish illustrations and your re- 
vised notes may not only lead (mirabile dictu) to some slight manifestation of student 
interest but also diminish your own boredom at having to reteach, so often, the 
elementary and the obvious.
Statistical Methods. Third Edition. F. C. Mills. New York: Henry Holt and Co., 1955. Pp.
viii, 842. $6.95. 


Edwin S. Mills, Massachusetts Institute of Technology


Virtually every chapter has been rewritten and modernized for the third edition
of this standard introduction to statistics. Both the arrangement of chapters 
and the emphasis placed on various topics have been altered considerably, re- 
sulting in a more satisfactory and unified exposition. Most important, the discus- 
sion of probability and statistical inference has been expanded and allocated the 
central place now typical in better texts. These chapters present the rudiments 
of probability theory, the normal and binomial distributions, point and interval 
estimation, and the Neyman-Pearson theory of hypothesis testing. The gain in- 
volved in this arrangement is considerable, but it cannot really be said that the rest of 
the book has been integrated with these chapters. Seldom are the specific tests and 
estimates which are presented in later chapters discussed in terms of the general 
considerations presented in the chapters on “foundations.” For instance, among the 
criteria discussed for a good estimator (lack of bias, efficiency, etc.), only bias is 
ever used in succeeding chapters, and the concept of an unbiased test, defined on p. 
211, is never mentioned again. 

Besides the major change mentioned above, several new topics are treated in the 
third edition and others are discussed more fully. Even more emphasis than before 
has been placed on regression analysis, χ² now occupies a full chapter, and a new
chapter on sample surveys has been added. The section concerned with time series
has been expanded, notably by a fuller discussion of index numbers and the addition of 
a chapter on the NBER techniques of measuring business cycles. 

The examples and illustrative material have been largely brought up to date, and 
again demonstrate the advantage of teaching statistics with reference to a specific 
subject matter. As before, examples are numerous and deal with interesting and rele- 
vant material. There are useful new appendixes on sources of economic statistics and 
computational techniques. References to the literature are numerous and have been 
brought up to date, but would be more useful to students if they were graded ac- 
cording to difficulty. 

In a volume this size it is inevitable that there should be a few slips and misprints, 
and there will be occasional annoyances for the scrupulous reader, such as the au- 
thor’s insistence on using “most probable” where “expected value” is meant, and 
statements like that on p. 550 that the null hypothesis for the F test is $s_1^2 = s_2^2 = \sigma^2$.
However, the major criticism of the book is surely the author’s virtual dismissal of 
regression techniques in the analysis of time series. After quite properly allocating 150 
pages to regression and correlation he rejects the application of these methods to 
time series, apparently because of “lack of independence among observations.” The 
three chapters on time series then concentrate on quite different techniques, such as 
moving averages and the NBER measures, which are not really statistics at all in 
the sense that no attempt is made to relate them to probabilistic considerations. In 
fact, of course, economists do apply classical regression techniques in the estimation 
of relationships among time series, and probably no less successfully than other
applications of statistics to economic data. In any case, the subject is too important 
to ignore and a relatively elementary discussion of the main difficulties together with 
practical suggestions for getting the best estimates possible, tests of randomness,
etc., should not be impossible for someone with Mills’ evident expositional skill. 

The third edition of this text is clearly a greatly improved volume, and Mills has 
been largely successful in his aim of providing a readable textbook of statistical tools 
with a minimum of proofs and technical discussion. In fact, there are very few statistical
techniques not found in this book which the economist requires before he is
turned loose on the world. If he is to apply the tools with discrimination, however, he 
will require a good deal more background in formal probability theory than is avail- 
able here. 


A Manual of Problems in Statistics, Revised Edition. Scott Dayton. New York: Henry 
Holt and Co., 1955. Pp. v, 137. $1.95. Cardboard. 


Elmer B. Mode, Boston University


As is stated in the preface, “This manual of problems has been written as a companion
volume to Statistical Methods (3d ed., 1955) by Frederick C. Mills.” There
are brief sections of introductory exercises designed to review formulas, logarithms, 
graphs of equations, and equation solving. There are in order, sections on the De- 
scription of the Frequency Distribution, the Normal Curve of Error, Estimation and 
Tests of Hypotheses, and Simple Correlation. The next five sections, on Secular 
Trend, Seasonal Variation, Cyclical Fluctuations, Price Indexes, and Volume Indexes 
clearly indicate that the manual is designed primarily for those interested in business 
statistics. The problems are confined almost entirely to the field of business and 
economics with none drawn, for example, from the fields of biology, medicine, education,
or psychology. The last three sections of the book deal with Chi-square, Variance
Analysis, and Multiple and Partial Correlation.

The Appendix contains tables of the usual character pertaining to the normal 
curve, t, r, r and z’, chi-square, and F. There are also powers of numbers, a page of
random numbers, and a five-place table of common logarithms. 

An excellent feature of the manual is the clarity with which each problem is pre- 
sented. Where interpretations are required, the author insists upon an answer phrased 
in terms of the specific data and upon a statement of the limitations involved. Any 
teacher of experience knows the difficulties encountered in these respects. The prob- 
lems are realistic with enough variety within types to provide for colleges where two 
or more sections of statistics are given. 


Basic Statistical Concepts. Joe Kennedy Adams. New York: McGraw-Hill, 1955. Pp. xvi,
304. $5.50. 


Oliver L. Lacey, University of Alabama


The author of this book intends it “primarily as a text for .. . students who have
had little or no previous calculus or statistics” and conceives of it as having two 
main purposes: (1) “to develop some basic mathematico-logical concepts of statis- 
tics, ... ” and (2) “to develop an understanding of the language used in mathemati- 
cal statistics including elementary calculus.” According to the preface, this book 
differs from others of a similar nature in two particulars: (1) in the introduction of 
the concepts of “sampling distribution, testing of hypotheses, confidence intervals, 
and power of a test without introducing other than finite populations,” and (2) in pre- 
senting within the text “the fundamental concepts of the calculus—limit, derivative 
and integral.” 
The book as a whole consists of a main text together with four appendixes. The
chapter headings of the main text are as follows: (1) Finite Populations and Their 
Distributions; (2) Sampling from a Finite Population; (3) Statistical Inference; (4) 
Parameters and Statistics; (5) Hypergeometric and Binomial Distribution; (6) Poisson
Distributions; (7) Discrete Distributions; (8) Continuous Distributions; (9) Normal
Distributions; (10) Chi Square; (11) “Student’s” t Distributions; (12) Bivariate
Distributions; (13) F Distributions and the Analysis of Variance; (14) Nonpara- 
metric Statistics. The appendixes are (A) Some Hints on How to Ask Questions of 
Mathematical Statisticians; (B) Mathematical Appendix (which includes a consider- 
able amount of introductory calculus together with several statistical proofs); (C) 
Miscellaneous Tables; (D) Tables of Sampling Distributions. 

My general opinion is that this book will probably be liked by many instructors 
but will prove difficult for most students unless given very considerable teaching 
support. A great deal of material is, in fact, covered in the 212 pages of the main text. 
It is covered, however, in a highly condensed style which may well appeal to the in- 
structor who is already familiar with the material but which is almost certain to be 
troublesome to students who are mathematically naive. I feel fairly confident that the 
author is over-optimistic in believing the text will serve well in an introductory 
course. Rather, it appears likely that it would be best used in a second course in sta- 
tistics or possibly as a first course for graduate students who have had a respectable 
amount of mathematics. In fairness I should say that the difficulty occasioned by the 
concise style is much alleviated by the examples and worked problems. These exer- 
cises are decidedly helpful in clarifying the abstract statements of the body of the 
text and further contribute to the student’s development of caution with respect to 
the fulfilment of the assumptions of the mathematical model by his actual experi- 
mental design. 

As mentioned above, the author feels it an advantage to discuss such concepts as 
those of sampling distributions, the power of a test, etc., as early as possible, using 
only finite populations. There are undoubtedly advantages to this particular approach.
However, it should be noted that something is also missed. Particularly does it seem 
true that the power of a test can more easily be grasped by means of a graphical approach
with a picture of a continuous frequency function to aid verbal discussion.
I am also somewhat doubtful of the author’s position that the basic concepts of calcu- 
lus can be run in satisfactorily, on the side as it were, within the length of time avail- 
able for the ordinary course. The section in the main body of the text, consisting of 
some eight pages on the essentials of differentiation and integration, I feel is more 
likely to result in confusion than in clarification. This section is, of course, supported 
by fourteen small-print pages in Appendix B covering a number of basic concepts of 
the calculus. But to make this material meaningful to the student I am sure that the 
instructor will have to spend considerable time and provide a fair number of exercises 
from other sources. 

In a few places the text seems misleading, if not in error, on fairly important points. 

As a solution to the problem of discarding exceptional observations (pp. 150-1), 
Adams suggests the use of a formula derived from Cramér in which, in effect, the 
extreme observation is considered as one group and the rest of the observations as 
a second group, and the ordinary t test computed. Adams does note that this test is
essentially invalid since we have drawn specifically an extreme value and thus the 
tabulated probabilities will be too low. Just how much too low, however, is an un- 
answered question, and the test in this form would appear of little use. Rather, some 
test specifically of extreme values should be employed. The same comments apply 
with respect to the related discussion of the testing of the significance of the difference 
between the sub-sample mean and the general mean. The point should be emphasized 
that the techniques suggested are appropriate only where the sub-sample is chosen a 
priori. 
On pp. 147-8, Adams discusses the problem of testing the hypothesis of equality
of two means without assuming equality of variances. He suggests the discarding at 
random of members of the larger group until equality is reached in the numbers in 
the two groups, followed by random pairing of the remaining members of the two 
groups. This technique is clearly correct from the standpoint of a mathematical model 
but equally clearly suffers from the defect that accidents of the particular random 
discarding and pairing may yield widely varying computed probabilities from the 
same data. However, instead of pairing at random after discarding, it would seem appropriate
to run the standard t test, since it is known “that for samples of equal size
there is not a serious likelihood of error in testing the difference of means as if the 
parent variances are equal.” (M. G. Kendall, The Advanced Theory of Statistics, Vol. 
II, 1951, p. 114.) 

On pp. 199-201 the author considers the problem of repeated measurements upon 
the same subjects under different experimental conditions. In general, his cautions 
concerning the appropriate mean square for error and the associated degrees of freedom
are well taken. He considers explicitly, however, the interaction model, in which
as a specific example, 2 subjects are tested under each of 20 experimental conditions. 
Here the number of degrees of freedom for error would appear to be 19 according to 
the model. This he rejects utterly, feeling that the appropriate number of degrees of 
freedom could never be greater than 1 since there are only 2 subjects. This point of 
view I also felt intuitively for several years to be compelling but after several sessions 
with mathematical statisticians, I have been convinced that it is not correct. If we 
are willing to believe that our mathematical model represents the actual experimental 
situation, then the appropriate number of degrees of freedom for error is precisely 
that indicated in the traditional analysis. Caution does indeed need to be exercised 
in drawing conclusions from such experiments but the caution stems basically from 
the dangers of nonrandomness in selection of the subjects rather than from a spurious 
inflation of the degrees of freedom. 

There are throughout the book, as is inevitable in a first edition, a number of minor 
errors and defects. On p. 156 the author considers bivariate and marginal distributions.
For the bivariate frequency distribution he uses the expression f(x,y). He then
expresses his marginal distributions as f(x) and f(y) respectively although it would 
be better mathematical practice to use different symbols from f. On p. 168, Adams 
states that “this ratio ($E[(y_p - Y)^2]/\sigma_y^2$) gives the correlation ratio eta ($\eta_{yx}$), which
is defined as follows,” giving then the correct formula for eta. The first part of the
statement makes it appear that he is defining $\eta$ as simply $E[(y_p - Y)^2]/\sigma_y^2$ instead
of one minus this quantity. The discussion of the partitioning of sums of squares on 
p. 192 makes it appear that variance analysis with a single criterion of classification 
demands equality of cases within each group. This could have been amplified to include 
unequal numbers of cases without undue difficulty. Throughout the text many of 
the references seem unduly general, and specific page references, especially with re- 
spect to Cramér, would be helpful. 
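
The distinction drawn above between the variance ratio and the correlation ratio can be made concrete with a small sketch. It assumes the usual reading, which the review does not spell out: the ratio is the mean squared deviation of y about its group (array) means divided by the total variance of y, and the squared correlation ratio is one minus that ratio. The function name and the grouping of the y values by x category are illustrative assumptions.

```python
def correlation_ratio_squared(groups):
    """Return (within_ratio, eta_squared) for y-values grouped by x category.

    within_ratio = sum of squares about group means / total sum of squares
    eta_squared  = 1 - within_ratio
    """
    all_y = [y for group in groups for y in group]
    grand_mean = sum(all_y) / len(all_y)
    total_ss = sum((y - grand_mean) ** 2 for y in all_y)
    within_ss = 0.0
    for group in groups:
        group_mean = sum(group) / len(group)
        within_ss += sum((y - group_mean) ** 2 for y in group)
    within_ratio = within_ss / total_ss
    return within_ratio, 1.0 - within_ratio


if __name__ == "__main__":
    print(correlation_ratio_squared([[1.0, 3.0], [5.0, 7.0]]))
```

Nothing in the function requires groups of equal size, which bears on the later remark about unequal numbers of cases in variance analysis with a single criterion of classification.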

These critical statements are not intended to be destructive. Rather, it is my 
opinion, in summary, that this is a useful book for a second course in statistics. It is
a book which I believe will be liked by many instructors. Moreover the student who 
is helped by a good instructor will probably attain a fairly comprehensive introduc- 
tion to mathematical statistics and will find at the end of the course that he has a 
useful reference source for most of the common statistical methods. 





380 AMERICAN STATISTICAL ASSOCIATION JOURNAL, JUNE 1956 


Elementary Statistics. Benton J. Underwood, Carl P. Duncan, Janet A. Taylor, and John 
W. Cotton. New York: Appleton-Century-Crofts, Inc., 1954. Pp. ix, 239. $3.25. 


Lucile Derrick, University of Illinois, Chicago


The four authors have condensed into a very compact volume the materials they
feel are necessary for a one-quarter course in statistics for psychology majors at
the sophomore level. Included is a relatively restricted number of statistical concepts 
which the authors state they feel “are currently most widely employed in describing 
distributions and testing differences between sets of data.” 

Four chapters abridge the usual treatments of frequency distributions, measures 
of central tendency, variability, and correlation. The full chapter on percentiles reflects
consideration for the special needs of the psychology student. In the chapters devoted 
to sampling error, significance of differences between means, and statistics in the de- 
sign of experiments, the authors have come closer than most treatises of this magni- 
tude to giving the student an awareness of the problems encountered in testing and 
experimentation. One would like to suggest, however, that the chapter on “Statistics 
and the Design of Experiments” become one of the early ones rather than Chap. 11, 
as it now stands. The student would then be in a better position to evaluate implied 
assumptions like “other things constant” in examples where sample results were 
compared. The inclusion of the two chapters on analysis of variance and chi square 
add much to the value of a book of this type. This in spite of the fact that the authors 
state that these chapters were intended for more advanced students or for a one- 
semester rather than their one-quarter course. Most of the common tables are listed 
in the appendix. 

There are a few instances where the authors have used terminology which might 
lead to confused thinking on the part of beginning students. For example: “Grouped 
frequency distributions” is used (p. 9) when “frequency distributions of grouped 
data” is meant. Students are asked to think of “i” as “the total number of units in 
the interval” (p. 37) which might imply to them that if continuous data were as- 
sumed, “i” would be infinite. “Since the distribution is symmetrical we may expect 
that three SD’s below M will approximately cover the distance” (p. 69). Here “normal”
is meant instead of “symmetrical.” “So that in effect Σ = N” (p. 75) which would
hold only in special cases when a constant was summed. The expression “1.57z/o” 
is used for 1.57 =2/o¢ on pp. 92 and 95 although it is correctly shown on p. 94. When 
testing the significance of the difference between two sample means, the expression, 
“The difference between the two population M’s is zero” is employed (p. 124). What
is really being tested here is the characteristic “mechanical aptitude” as observed in 
two samples which may have come from the same or different populations. 

On the other hand, the inclusion by the authors of many examples, illustrations, 
and diagrams could be of much assistance in guiding step by step the thinking of the 
beginning student in the learning process. 


Statistical Methods for Social Scientists. Lillian Cohen. New York: Prentice-Hall, Inc., 
1954. Pp. x, 181. $5.35. 


Leo Katz, Michigan State University


The remarks given in the preface of this new book suggest why the author and
publisher felt justified in adding another to the abundance of such books. It is asserted
that the book is (i) “designed as an introduction to statistics for social scientists,”
(ii) “comprehensive enough to give insight into the logic involved in statistical
manipulation,” and (iii) “simple enough to be understood by anyone who has taken 
an elementary algebra course.” Certainly, there is a need for such a book; this review
is directed toward examination of how well this text satisfies the expressed desiderata. 

The didactic method adopted by the author consists in first presenting a specific 
social science problem and then developing the concepts and techniques as needed in 
solving the problem. The problems chosen for presentation are honest, fairly current 
problems in the social sciences. The data used are sometimes real, sometimes hypo- 
thetical. The net effect, however, is disappointingly small in view of the boldness of 
the approach. The reader is permitted to see the movement and action but not the 
strategy of the attack on a real problem. He may admire the skill of the protagonist, 
but he does not understand it. In this sense, the book fails in its first stated objective. 
It is no more an introduction to statistics than is viewing the television program, 
“The Medic,” an introduction to medicine. Both may serve to whet the appetite and 
arouse an interest which could lead to the study of a real introduction to the field. 
In this respect, indeed, Cohen’s book does a very fine job. 

The ability of the text to “give insight into the logic involved” is brought into sharp 
focus in the discussion of testing statistical hypotheses on pages 90-93. The discussion 
is limited to the Neyman-Pearson theory of testing hypotheses; as is well-known, this 
theory has an elegant and simple logical structure. This logic, however, is completely 
ignored. The reader is told that there are two kinds of errors and that, by shifting 
a level of significance, he may decrease risk of one at the expense of increasing risk 
of the other. He is advised to use tail regions for rejection solely on the grounds of 
precedent, since this is done “in most statistical studies in the social sciences.” The 
best thing in these four pages is the reference to the book of Dixon and Massey. 

The text material is simple enough to be understood by a reader familiar with ele- 
mentary algebra. This state of affairs is largely the result of the Procrustean device 
of deleting every statement which could not be understood readily by such a reader. 
Surgery as drastic as this can only butcher that which it is designed to simplify. 
Further, the author fits the stereotype of the surgeon in exhibiting an overly-strong 
tendency to cut. Even where it is possible to present material in a manner compatible 
with the assumed weak mathematical background of the reader, the author chooses 
the easy way of omitting the crucial part of the argument. In the treatment of “line 
of best fit” on p. 143, for example, there appears the bare statement that “when b is 
set equal to $\sum xy / \sum x^2$, and a is set equal to $\bar{Y} - b\bar{X}$, we have a line which passes through
the scatter diagram in such a way that the sum of squares of deviations from the line 
are (sic) minimized.” A footnote refers students who know some calculus to Hoel’s 
book. The author makes no effort to indicate the well-known algebraic development 
based on the minimum value of an upward-opening parabola. 
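
The algebraic argument alluded to can be sketched as follows. For a line through the point of means with slope b, the sum of squared deviations is $\sum y^2 - 2b\sum xy + b^2\sum x^2$ (x and y measured from their means), an upward-opening parabola in b whose minimum lies at $b = \sum xy / \sum x^2$. The short check below uses hypothetical data; the function names are illustrative and are not Cohen's.

```python
def best_fit_line(xs, ys):
    """Return (a, b) for the least-squares line Y = a + bX."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Sums of products and squares of deviations from the means.
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b = sxy / sxx              # vertex of the parabola in b
    a = y_bar - b * x_bar      # forces the line through the point of means
    return a, b


def sum_of_squares(a, b, xs, ys):
    """Sum of squared vertical deviations from the line Y = a + bX."""
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))


# Hypothetical data, for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]
a, b = best_fit_line(xs, ys)
print(a, b, sum_of_squares(a, b, xs, ys))
```

Moving the slope or the intercept away from these values in either direction increases the sum of squares, as the parabola argument predicts.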

The colloquial form of expression adopted by the author grated a little on the re- 
viewer; students may find it refreshing. The text is liberally sprinkled with good 
examples and exercises for the student. In an appendix, in addition to the usual 
tables, there appears a four-page summary of formulas occurring in the text, together 
with page references. 

The reviewer finds himself in complete agreement with the author that the kind of 
book this was to have been, perhaps employing the case-study approach of this book, 
would be a most valuable addition to the supply of textbooks of statistical methods. 
It would seem that the failure of the book to meet its objectives lies entirely in the 
execution; the plan is sound, and perhaps the present author, in a revised version, 
will be more successful. The author is to be commended for courageously breaking 
with tradition in attempting to do what seems not possible by traditional methods, the 
writing of an effective and sound introduction to statistical methods for mathemati- 
cally naive social scientists. 
Statistical Methods for the Behavioral Sciences. Allen L. Edwards. New York: Rinehart 
and Company, Inc., 1954. Pp. xvii, 542. $6.50. 


Milton E. Terry, Bell Telephone Laboratories, Inc. 


The review will commence by stating a necessary condition for an elementary text 
book to be acceptable when written without an assumption of college mathe- 
matics, as is this one. The material presented shall not only contain neither erroneous 
statements nor misleading sections or statements, but also shall not ignore methods 
and ideas essential to the areas treated in the text. 

In spite of the praiseworthy introduction of many nonparametric statistics and test 
procedures, Professor Edwards has failed to meet the above condition to such an 
extent that the text cannot be recommended either as a text or a reference book. The 
following is a selection of the failures which led to the reviewer’s opinion. 


(a) p. 145 “...; a correlation of 0 indicates no relationship whatsoever between 
the two variables; and a correlation coefficient of —1.00 indicates a perfect 
negative relationship.” 

(b) The theory of linear regression is developed without the assumption of homoscedasticity; 
in fact, testing the hypothesis that the population regression coef- 
ficient has a specified value is performed by means of a t test without the as- 
sumption of normality, independence or homoscedasticity of the residuals. The 
assumption of normality is not discussed in either the distribution of the sample 
correlation coefficient or of the t statistic, although when the two sample non- 
parametric tests are introduced the assumption is belatedly introduced. 

(c) The development of random errors of measurement is completely from the 
large sample viewpoint and follows very closely the development of H. Gulliksen, 
Theory of Mental Tests (New York: Wiley, 1950). In the light of modern 
statistics it is quite an inadequate development. 

(d) Analysis of variance is generally restricted to the cases of a one-way classifica- 
tion and the two factor factorial design. The rejection of F values less than 1 
as yielding no information about the null hypothesis, the use of t tests on group 
means after F tests have been applied and the general treatment of interaction 
mean squares leave much to be desired. 

(e) Confidence intervals are not discussed even though most of the scientific 
literature of this country uses them to the exclusion of the fiducial interval. 
The fiducial limits, intervals and probabilities are defined, but in the examples 
given to clarify these concepts an unusual form of statement is made. The 
author rejects this (null) hypothesis at the “5 per cent level of confidence” 
when the statistic |z| exceeds 1.96. It would appear that significance levels, 
confidence coefficients, and fiducial probabilities have been admixed. 


Finally, it seems to this reviewer that one of the main research tools of the be- 
havioral scientist is sampling with all its ramifications and difficulties including the 
problems of editing of data, non-response, and stratification. This text omits any 
discussion or reference to sampling. 

The acceptability of a text is conditioned by the availability and quality of com- 
peting texts. Since good elementary texts already exist for students of the behavioral 
sciences, this text cannot be justified solely on the grounds of need. 
An Introduction to Stochastic Processes with Special Reference to Methods and 
Applications. M. S. Bartlett. New York: Cambridge University Press, 1955. Pp. xiv, 312; ‘50. 


Leonard J. Savage, University of Chicago 


Every statistician knows something about stochastic processes, though like me he 
may be late to learn, and never entirely comfortable with, that awesome sounding 
name. For this review, it will be enough to say that a stochastic process is the distri- 
bution of which an individual time series (discrete or continuous) is the sample point. 
The time series of economics, meteorology, quality control, and epidemiology are 
especially familiar to statisticians, largely because they give rise to important, and 
usually harrowing, problems in statistical inference. The Brownian motion of a parti- 
cle presents time series of great interest in physics, but the inference problems for these 
and related series are usually unchallenging, because data for them is so abundant 
that efficiency in drawing the inferences is unimportant. 

Attempts to produce mathematically formal definitions of the stochastic process 
concept have failed of their direct object. What they have done is to reveal that there 
is no sharp demarcation between the study of stochastic processes and the rest of the 
theory of probability—even time proves only incidental to the idea. There is, rather, 
a vague, informal demarcation in terms of several ideas and techniques, no one of 
which is essential. 

Some of the oldest problems of probability, such as that of the gambler’s ruin, 
concern stochastic processes. Within statistical inference, stochastic processes are 
also old, for where is the beginning of the tarnished history of period hunting and 
forecasting? But during the past two decades, interest in stochastic processes has 
been extraordinarily intense and there have been many—bewilderingly many—ad- 
vances in their theory and application brought about by remarkably effective, though 
not always conscious or even willing, collaboration of workers from many fields. 
Mathematics, engineering, physics, astronomy, meteorology, economics, and micro- 
biology are among the important contributing fields. 

In this period of rapid expansion the field of stochastic processes naturally seems 
interesting and attractive but threateningly difficult to many people outside of it, 
so that expository treatments are eagerly sought and awaited. The book under review 
is, as its title indicates, just such an exposition, and it is entirely without precedent 
in completeness and variety. Today, anyone who wants to learn the subject (unless 
possibly his interest in it is exclusively that of the pure mathematician) will have to 
turn to this book both early and late in his study. Anyone who has a specific problem 
in the application of stochastic processes will turn to it for specific ideas and 
references. 

As I have already implied, the coverage of the book, except for certain mathemati- 
cal areas at which it is not at all directed, is remarkable. The difficult problem of 
organization is met with thought and flexibility. High accuracy is maintained. In two 
respects, though, I am disappointed in the book. First, I had hoped to find more 
emphasis on the qualitative behavior of stochastic processes; and, second, the 
writing seems trying and difficult. 

Let me illustrate my meaning about qualitative behavior by one inclusion and one 
omission. The number of molecules of gas in a spherical volume of radius a is in con- 
stant fluctuation, and in time it will surely fluctuate by 1% from its mean, or typical, 
value. Smoluchowski showed that the expected time between such fluctuations de- 
pends with fantastic sensitivity on the radius a. Bartlett mentions, in illustration of 
this point, that the expected times under ordinary conditions are roughly $10^{-11}$ sec., 
$10^{6}$ sec., and $10^{68}$ sec. for a equal to 100, 300, and 500 millimicrons, respectively. This, 
of course, gives an insight that a mere glance at the formula from which these num- 
bers are readily computed would not. It seems to me that Bartlett has included much 
too little of this sort of information. It is, for example, disappointing, in reading about 
the problem of gambler’s ruin, to find no allusion to such colorful ideas as the follow- 
ing. A gambler wants to maximize his probability of winning a certain target sum 
before losing his present fortune. He is playing an “unfair” betting game, that is, at 
each play he chooses a stake X to risk and always has the probability p of recovering 
his stake together with winnings aX, where (a+1)p < 1. It is disastrous for such a 
gambler in the name of caution to risk a large number of small stakes. In fact, he 
should boldly stake at each play the largest sum then at his disposal that cannot re- 
sult in his total winnings’ exceeding the target sum. 
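
The advantage of bold play can be checked numerically. The sketch below is a minimal illustration under assumptions that are not in the book or the review: an even-money game (a = 1, so the game is unfair when p < 1/2), whole-number fortunes, a target of 8 units, and p = 0.45. The winning probabilities are found by repeatedly applying the one-step recursion until the values settle.

```python
def win_probability(stake_rule, target, p, sweeps=5000):
    """Probability of reaching `target` before ruin, from each fortune 0..target.

    Even-money game: the stake s is won with probability p and lost otherwise.
    `stake_rule(x, target)` returns the whole-number stake at fortune x.
    """
    v = [0.0] * (target + 1)
    v[target] = 1.0                     # target reached: success
    for _ in range(sweeps):             # fixed-point iteration on the recursion
        for x in range(1, target):
            s = stake_rule(x, target)
            v[x] = p * v[x + s] + (1.0 - p) * v[x - s]
    return v


def bold(x, target):
    # Largest stake that cannot carry the fortune past the target.
    return min(x, target - x)


def timid(x, target):
    # Cautious play: one unit at a time.
    return 1


p = 0.45                                # hypothetical win probability, p < 1/2
target = 8                              # hypothetical target fortune
bold_v = win_probability(bold, target, p)
timid_v = win_probability(timid, target, p)
print("start at 1 unit: bold", bold_v[1], "timid", timid_v[1])
```

Starting from one unit, bold play must win three times running (1 to 2 to 4 to 8), so its chance is p cubed; the unit-stake player does noticeably worse, which is the point of the remark above.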

My general difficulty in reading the book is probably closely dependent on the 
paucity of qualitative material, for much space is occupied by formal algorithms, 
which are often of great value when properly put to work but extremely dry on first 
encounter. The following quotation typifies what I mean: 


The normal or Gaussian distribution for a vector variable X may be most simply de- 
fined in terms of its cumulant function, which is 


$$K(\theta) = \theta' m + \tfrac{1}{2}\,\theta' V \theta. \tag{12}$$


Its distribution function, which may be obtained from $C(\theta)$ by inversion, is equivalent 
to a density function 


$$f(x) = (2\pi)^{-\frac{1}{2}n}\,|V|^{-\frac{1}{2}} \exp\{-\tfrac{1}{2}(x - m)' V^{-1} (x - m)\}, \tag{13}$$


where |V| denotes the determinant of V. Any linear transformation on X still yields 
a normal distribution function. 


It would clearly be unreasonable to expect anyone to learn about the normal distri- 
bution from that alone. In fairness to Bartlett, almost no one has any business read- 
ing this book until he knows a little about the normal distribution from other sources. 
But the same abruptness and formality do occur often in other parts of the book 
where they present serious obstacles to any reader and especially to one who is not 
primarily a mathematician—one sort of reader for whom the book is especially im- 
portant. The difficulty can be expressed by saying that the book is extremely mathe- 
matical in one sense, and this took me unawares in view of the title and the announce- 
ment in the preface that “A much more elementary discussion of mathematical meth- 
ods and statistical techniques, addressed to the applied mathematician and statis- 
tician, would be attempted in the present introductory work.” In that attempt, it 
seems to me that Bartlett has been only partly successful, failing somewhat in two 
different ways. He has shielded his reader from many delicate arguments productive 
of rigor but not of answers that the “applied mathematician and statistician” will be 
glad to leave to specialists, at least until a later day, but he has also, I feel, left out 
too many instructive and revealing arguments; and yet the book remains highly 
mathematical, too much so for many who are quite competent statisticians. Diffi- 
culty is a most subjective concept; one review I have come across particularly com- 
mends this book for its clarity and readability. 

It seems futile to try to reflect the table of contents here in any detail. There is 
something about almost every immediately applicable topic and even about most 
applications. There is nothing about martingales, which is the topic of high mathe- 
matical fashion just now. There is a short but rich chapter (Chapter 7) on two topics 
from communication engineering, linear prediction and information, which will be 
especially welcome because they are high fashion in applied mathematics. Readers of 
this journal will be especially interested in Chapters 8 and 9, which are devoted to 
statistical inference. 

The bibliography is very valuable. The glossary and index seem a little too brief. 


Statistics in Research. Bernard Ostle. Ames, Iowa: Iowa State College Press, 1954. Pp. 
xiv, 487. $6.95. 


Allan Birnbaum, Columbia University 


The intention of the author is “(i) to provide a book giving the principal statistical 
methods of use to workers in all areas of scientific research, and (ii) to provide a 
book designed to facilitate the teaching of the science of statistics.” The author has 
taught courses based on the contents of the book at Iowa State College and Montana 
State College. No mathematical training is presupposed. 

The first chapter briefly puts statistical techniques in perspective in relation to 
science, logic, and research as a whole, without entering upon technical and philo- 
sophical issues. The last chapter, “Design of Experimental Investigations,” makes 
clear that the techniques described in the body of the book can become genuinely 
useful only within the context of suitably designed experiments. Here the funda- 
mental role of randomization over experimental units is stressed without denying 
the usefulness of systematic elements in designs. The complexities of the physical 
meaning of “experimental error” are indicated briefly. Chapter 2 introduces and 
illustrates the basic statistical concepts of frequency distribution, population, sample, 
parameter, interval and point estimate, test, Type I and Type II errors. A mathe- 
matical appendix adds descriptive accounts of some important sampling distributions 
and their interrelations, with formulas, graphs, and explanations of the use of the 
ample statistical tables provided. Chapter 3 gives detailed illustrations of the dis- 
tributions of sample means, variances, t and F statistics, utilizing empirical sampling 
data extensively. Chapters 3-14 include confidence intervals and tests for attribute 
data, estimates and tests on means and variances of one or two normal populations, 
regression with one or several independent variables, analysis of variance and co- 
variance for many experimental designs including factorials and lattices, Models I 
and II and mixed models for analysis of variance. A large collection of problems is 
included, without answers. Some recently developed techniques which are included 
are Scheffé’s method for simultaneous estimation of all contrasts and Bartlett’s 
method of fitting a line where both variables are subject to errors. Detailed advice 
is given on the determination of sample sizes for many techniques and on repeated 
significance tests and pooling. 

Some regrettable lapses from the generally high level of exposition and arrangement 
may be noted, although these do not seriously detract from the value of the book as 
a whole. The usefulness of the standard statistics for comparing means when the 
normality assumption fails is not duly stressed, although it is discussed in one section; 
this may lead some readers to undue conservatism in the use of some standard 
techniques. Binomial confidence intervals are introduced on p. 55 without a reference 
to p. 81 where the properties of confidence intervals are really explained for the first 
time. The first detailed explanation of a significance test occurs in a very special 
context—triangular (taste) tests—and is marred by an inaccurate expository state- 
ment; an early description of a test suggests that it may be a three-decision rather 
than a two-decision procedure (p. 27); excellent interpretations of significance test 
procedures appear later. The need sometimes to combine classes in a table of attribute 
data is illustrated, but the reader will have to consult another reference to find defi- 
nite directions on when and how this should be done. Further editing could have 
made the book even more useful. 

This book should prove of definite value to a large group of research workers in 
affording a very readable and considerably detailed introduction to statistical concepts 
and techniques and to the practice of their application in research. The level 
of precision and clarity of exposition is on the whole very high. The writing conveys 
an attitude much needed by the intended reader, an attitude of well-balanced appre- 
ciation for the mathematical models underlying statistical techniques on the one 
hand, and the realities of experimental situations on the other. Temptations to over- 
simplify technical and methodological issues are commendably avoided. The careful 
reader will be well oriented and guided toward reasonable use of the techniques pre- 
sented and toward further references when required. 


Experimental Design and Its Statistical Basis. D. J. Finney. Chicago: The University of 
Chicago Press, 1955. Pp. xi, 169. $4.50. 


W. S. Connor, National Bureau of Standards 


This little book, for workers in biology and medicine, aims to develop appreciation 
of the design and analysis of experiments. It is not a textbook, replete with cur- 
rent factual knowledge, but a rapid survey concerned with introductory concepts. 

There is much common sense in this book. In comparing a new treatment with a 
control, we are cautioned that “observations on control and treated subjects should 
be made contemporaneously, lest conclusions be biased by changes in conditions irrele- 
vant to the general operation of the new treatment” (p. 23). Concerning randomiza- 
tion, we are warned that “characteristics of the subject (such as age, sex, previous 
history) or of the severity of the disease must not affect the allocation to treatment. 
An objective rule for allocation is essential. The only safeguard is randomization. 
Whether or not the subject receives the treatment must be decided by the fall of a 
well-balanced coin, the drawing of lots, or some similar random process” (p. 23). 
Concerning replication, we are advised that “whatever the units to which treatments 
are to be applied, two or more plots must be allocated to each treatment, in order that 
account may be taken of individual variations between units treated alike. For if 
only one rat (were) allocated to each of two treatments, there would (be) no way of 
judging whether an observed difference was the effect of treatment or was entirely 
due to chance” (p. 47). (Change of tense by the reviewer.) 

Such clear English writing seems likely to get the ideas across and to promote 
goodwill towards statisticians. Unfortunately, such passages are interrupted by dis- 
courses on methods of analysis, which are complicated and cannot be adequately 
treated in the space available. It is this reviewer’s sad, but inescapable, opinion that 
many readers will not become aware of much good advice contained in the later 
chapters, because before reaching it, they will have expired with an anguished cry, 
mired in the intricacies of $\chi^2$ or the F ratio! 

To illustrate the difficulties encountered by the author in discussing the analysis 
of data, we may look in on page 14, where $\chi^2$ is introduced: “If theory states that a 
fraction P of observations ought, on an average, to fall into one of two classes, and of 
a set of n independent trials the proportion in this class is p, then the quantity $\chi^2$ 
(“chi-squared”), defined by 

$$\chi^2 = \frac{n(p - P)^2}{P(1 - P)},$$

can be used to approximate to the test of significance of the deviation of p from the 
theoretical value.” It is then explained how to carry out the test, but no mention is 
made that this is $\chi^2$ with one degree of freedom, perhaps because the concept of 
degrees of freedom is not introduced until page 35. 
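
A short computation shows what the formula does in practice. The sketch below is illustrative only: the figures (100 trials, an observed proportion of 0.6 against a theoretical 0.5) are hypothetical, and the tail probability uses the fact that a $\chi^2$ variate with one degree of freedom is the square of a standard normal deviate.

```python
import math


def chi_squared_two_classes(n, p_observed, p_theory):
    """Chi-squared for an observed proportion against a theoretical fraction.

    Returns (chi-squared, approximate significance probability on 1 d.f.).
    """
    chi2 = n * (p_observed - p_theory) ** 2 / (p_theory * (1.0 - p_theory))
    # For one degree of freedom, P(chi-squared > c) = erfc(sqrt(c / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical figures: 60 of 100 trials in the class, theory says one half.
chi2, p_value = chi_squared_two_classes(100, 0.6, 0.5)
print(chi2, p_value)
```

With these figures the statistic is 4, just beyond the conventional 5 per cent point of 3.84 for one degree of freedom.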

On page 16, an effort is made to help the reader appreciate this $\chi^2$ formula: “Now 
inspection of the formula for $\chi^2$ indicates that $P(1 - P)/n$ is a measure of the extent to 
which p is likely to vary about P in a progeny of size n; this is apparent because the 
probability associated with any particular value of $(p - P)^2$ is dependent only on 
$(p - P)^2 \div P(1 - P)/n$, so that the divisor scales down any squared deviation $(p - P)^2$ 
in such a way as to eliminate the influence of P and n on its probability.” Surely this 
explanation of the variance of p must seem as incomprehensible to the uninitiated 
as it seems bizarre to the statistician! 

In his discussions of experimental plans, the author is more successful. He provides 
an interesting survey of a wide variety of experimental designs, including randomized 
blocks, Latin squares, incomplete blocks, and factorials; and happily, the various 
arrangements are illustrated by real data. In the chapter on sequential experimenta- 
tion, we are reminded that the currently acclaimed Box and Wilson procedure is 
only one among several techniques which may properly be called sequential. Finally, 
there is a chapter devoted to biological assay. 

In summary, this little book errs in the usual way in which little books err. It tries 
to do too much. It also errs in the way in which both little and big books err when 
they attempt to promote the design of experiments. It confuses the design of experi- 
ments with complicated statistical techniques of analysis, automatically insuring a 
small readership. 

Despite these shortcomings it is to be hoped that the research worker in biology 
and medicine will ferret out the many useful sections on planning experiments. 
They will reward his efforts. 


Principles and Practice of Field Experimentation. Second Edition. John Wishart and H. G. 
Sanders. Cambridge, England: W. Heffer and Sons Ltd., 1955. Technical Communication 
18, Commonwealth Bureau of Plant Breeding and Genetics. Pp. vii, 133. 21s. 


G. A. Baker, University of California, Davis 


The authors have collaborated to write an excellent manual on field experimenta- 
tion. Wishart wrote the first part which presents practically, and with no mathe- 
matical proofs, the statistical methods commonly needed in the reduction of agricul- 
tural field trial data for the common experimental designs with worked numerical 
examples. 

In the second part by Sanders advice is given on experimental policy, on the agri- 
cultural significance of experimental results, and on many practical points in experi- 
mental lay-outs and the taking of observations from plots. The two parts cover the 
syllabuses on this subject in the Agricultural Colleges of the Commonwealth. 

The original edition was published in 1935 based on the 1926 article entitled “The 
Principles and Practice of Yield Trials” by F. L. Engledow and G. Udny Yule pub- 
lished in The Empire Cotton Growing Review (Vol. III, Nos. 2 and 3) and the work of 
R. A. Fisher. The second edition brings the subject partially up to date. 

Tests of differences between means are based on least significant differences and 
the idea of range if many means are to be compared. The treatment is very sketchy. 

On page 23 in discussing a dummy trial imposed on a uniformity trial it is implied 
that significant differences should not be found. Of course, even if everything were 
in conformity with the assumed mathematical model (all distributions normal with 
the same variance), “significance” would be found with the frequency indicated by 
the type I-error level. The discussion in pt. II, p. 86, is more realistic in that Sanders 
recognizes that unremoved soil fertility differences may grossly distort the frequen- 
cies of both type I and type II errors in analyses of variance of yield trials. 
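
The point about dummy trials can be illustrated by simulation. The sketch below makes simplifying assumptions that are not in the manual: two dummy "treatments" of ten plots each, yields drawn independently from one normal distribution with known unit variance, and a two-sided z test at the 5 per cent level. The seed and the number of trials are arbitrary.

```python
import random
from statistics import NormalDist


def false_positive_rate(trials=20000, plots=10, alpha=0.05, seed=1955):
    """Share of dummy trials declared 'significant' when no real effect exists."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    se = (2.0 / plots) ** 0.5           # s.d. of a difference of two means
    hits = 0
    for _ in range(trials):
        # Both "treatments" come from the same population (uniformity trial).
        mean_a = sum(rng.gauss(0.0, 1.0) for _ in range(plots)) / plots
        mean_b = sum(rng.gauss(0.0, 1.0) for _ in range(plots)) / plots
        if abs(mean_a - mean_b) / se > z_crit:
            hits += 1
    return hits / trials


rate = false_positive_rate()
print(rate)
```

Under these assumptions the share of "significant" dummy trials stays close to the nominal level; it is neither zero nor a sign that anything is wrong with the trial.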

Sanders on p. 83 takes a much more restricted view of the role of “statistics” than 
does L. J. Savage, for example. Sanders regards statistics as a tool to answer detailed 
specific questions with a definite probability of being wrong. Savage on the other 
hand views statistics as the discipline of rational decision in the face of uncertainty. 


Theory of Games and Statistical Decisions. D. Blackwell and M. A. Girshick. New York: 
John Wiley and Sons; London: Chapman and Hall, 1954. Pp. xi, 355. $7.50. 


I. J. Good, Cheltenham, England 


Before entering into the review itself it may be observed that the participation of 
H. Rubin and P. Suppes is acknowledged as amounting “practically to co- 
authorship.” In view of the other acknowledgements in the Preface, and of the numer- 
ous contributions of the authors themselves to the subject, this book may be regarded 
as authoritative, or even as the 1954 definition of “decision theory.” 

The book is dedicated to A. Wald, and follows in his tradition of making much use 
of utilities in statistics, of relating statistical procedures to the theory of games, and 
of emphasizing sequential procedures. The style also is like Wald’s in avoiding verbal 
philosophy, history, polemics and judicious repetition. The result is cold and austere, 
but it is nearly always unambiguous. It can hardly be misunderstood, but it will also 
not be understood except by the careful and patient reader. For those who are 
already fairly familiar with the subject matter it will make a very useful reference 
book. For this purpose its value would be increased if it had an index of symbols, 
since the symbol-to-word ratio is high. 

It is never stated whether any result is new. The 180 numbered references are 
given at the end of the book, with lists of the numbers of those that are related to 
each chapter. This highly impersonal style is not to the reviewer’s taste. A more 
personal and historical style may occasionally lead to bickering, but it keeps up the 
human interest. Owing to the mechanical perfection of the style the reader may 
occasionally feel like an electronic computer that is having a program loaded into it. 

Only 14 of the references are dated pre-1945, showing how modern the subject is 
in its detailed development. The only direct reference to Fisher is to one of his non- 
statistical papers! 

A fair idea of the coverage and emphasis may be obtained from the following list 
of chapter headings: (1) games in normal form, (2) values and optimal strategies in 
games, (3) general structure of statistical games, (4) utility and principles of choice, 
(5) classes of optimal strategies, (6) fixed sample-size games with finite Ω [i.e., there 
are only a finite number of simple statistical hypotheses (states of nature), ω], (7) 
fixed sample-size games with finite A [A is the class of possible actions for the statis- 
tician], (8) sufficient statistics and the invariance principle in statistical games, (9) 
sequential games, (10) Bayes and minimax sequential procedures when both Ω and A 
are finite, (11) estimation, (12) comparison of experiments. 

The book begins with a clear account of the Borel-von Neumann-Morgenstern 
concept of a game, giving, for example, the definition of a minimax strategy, together 
with some fundamental theorems about zero-sum two-person games. The theory of 
games is then related to statistics by regarding the two players as the statistician 
and nature. It is admitted however that nature cannot be regarded as an intelligent 
opponent, and the rational way of playing statistical games is therefore described as 
an “unsolved problem.” 
In the discussion of utility it is assumed that “the decision maker has formulated 
his objectives with sufficient clarity to declare which of two outcomes he prefers (or 
that he is indifferent between them), and, more generally, for any two probability 
distributions ..., to declare which he prefers.” (If the statistician knows his own 
mind so well we may ask why he cannot also estimate his degrees of belief concerning 
the various possible states of nature. If he could do this he could in principle solve 
all his problems unambiguously by maximizing expected utility.) The relationship 
between games against an intelligent opponent, statistical games, and the general 
decision problem is described in the following illuminating manner. “In the first case, 
ω is chosen by an opponent whose utility is the negative of the decision maker’s; the 
statistical game is the case in which ω is regarded as being determined by nature. 
In the general decision problem ω may for instance be partly chosen by nature and 
partly by various individuals whose utilities may be in any relationship to those of the 
decision maker.” 

There is some discussion of the advantages and disadvantages of the original mini- 
max principle and of its modification (the Wald-Savage minimax loss or minimax 
regret principle). The type II minimax principle (minimax defect) is not discussed, 
presumably because its consequences have not yet been worked out in any detail. 
(See Journal of the Royal Statistical Society, Series B, 14 (1952), 107-14; and loc. cit., 
vote of thanks in a symposium on linear programming, forthcoming.) It is proved 
that if the statistician can satisfy a decision principle satisfying a few natural de- 
siderata, then he will behave as if he accepted a certain prior (=initial) distribution 
of probabilities of the available simple statistical hypotheses. 

It is claimed (p. 127) that “the theory that served to delineate optimal strategies 
in games played against an intelligent opponent serves to delineate classes of optimal 
strategies in games played against nature.” 

While general theorems are the main preoccupation there are several applications 
to classical statistics, in each of which a precise assumption is made concerning utili- 
ties. If the orthodox (Fisherian) statistician says that utilities are not usually relevant 
to statistical practice he should consider the following examples. 

(i) On pages 229-33 the problem of random sampling from a finite population is 
considered. Our statistician wishes to sample N individuals out of a population of M 
individuals, itself a sample taken from a larger population by nature or by another 
“player.” The cost of sampling, to our statistician, is the same for all $\binom{M}{N}$ possible 
samples and the expected loss for any act depends only on this act and on the M 
individuals, not on their order. (For the definition of “loss” see the next example.) 
Then it is shown that certain general principles, called the principles of “invariance” 
and “sufficiency,” lead to the well-known method of random sampling, i.e., to giving 
each of the $\binom{M}{N}$ an equal chance of being used. The assumptions made here would 
probably be in the preconscious minds of most practicing statisticians. If one of the 
functions of philosophy is to bring preconscious thoughts into consciousness, then 
this book is philosophical. 
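
As an illustration of the sampling rule in example (i), and not something taken from the book under review, the following minimal Python sketch enumerates every subset of N individuals out of M and draws one of them with equal chance. The sizes M = 6 and N = 3 and the helper name `simple_random_sample` are assumptions chosen for the illustration.

```python
import itertools
import math
import random


def simple_random_sample(population, n, rng=random):
    """Draw n distinct individuals so that every n-subset is equally likely."""
    return rng.sample(list(population), n)


# Hypothetical sizes: a population of M individuals, a sample of N.
M, N = 6, 3
population = list(range(M))

# Enumerate every possible sample; there are C(M, N) of them.
all_samples = list(itertools.combinations(population, N))
n_samples = len(all_samples)

# Under simple random sampling each subset has this probability.
equal_chance = 1 / n_samples

# One reproducible draw (the seed is arbitrary).
rng = random.Random(0)
draw = simple_random_sample(population, N, rng)
```

Enumeration is only practical for small M; `random.sample` gives the same equal-chance property without listing the subsets.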

(ii) Suppose we wish to make a point estimate, $\hat{p}$, of the chance, $p$, in a binomial
distribution; sample size $= n$, number of successes $= r$. In other words we are going
to say that $p$ is (approximately) equal to $\hat{p}$, for which statement we receive expected
utility $K(p) - L(p, \hat{p})$, where $L(p, \hat{p})$ is the (expected) loss (or regret) and is positive
when $\hat{p} \neq p$ and vanishes when $\hat{p} = p$. It is shown (p. 167) that the (Wald-Savage)
minimax strategy leads to the familiar estimate $\hat{p} = r/n$ in case $L(p, \hat{p}) = (p - \hat{p})^2 / [p(1 - p)]$.
If, in a given application, we do not accept this loss function we probably
ought not to use the estimate $r/n$. When the loss function is $(p - \hat{p})^2$ we get the mini-
max estimate (p. 170)

$$\hat{p} = \frac{r + \tfrac{1}{2}\sqrt{n}}{n + \sqrt{n}} \qquad (n > 0).$$


For large samples these estimates seem reasonable enough. But if n were small and 
we were sampling heads and tails by spinning a freshly minted coin, neither of these 
estimates would be reasonable, because of our judgment about the initial probability 
distribution of $p$ (sharply humped near $p = 1/2$). To this objection it may be replied
that the loss to the statistician’s reputation would be so great if he estimated say
$p = 2/3$, on getting two heads and one tail, that the above loss functions would not be
realistic. In this way it may sometimes be possible to push the judgment of the initial 
distribution into the loss function, but it would be rather a swindle. 
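
The two estimates in example (ii) can be compared numerically. The sketch below is an illustration added here, not a computation from the book: it evaluates the exact risk of each estimate under its own loss function by summing over the binomial distribution. The function names and the grid of values of p are assumptions made for the example.

```python
import math


def binom_pmf(r, n, p):
    """Probability of r successes in n binomial trials with chance p."""
    return math.comb(n, r) * p**r * (1 - p) ** (n - r)


def familiar_estimate(r, n):
    """The familiar estimate r/n."""
    return r / n


def minimax_estimate(r, n):
    """Minimax estimate under squared-error loss: (r + sqrt(n)/2) / (n + sqrt(n))."""
    return (r + 0.5 * math.sqrt(n)) / (n + math.sqrt(n))


def squared_loss(p, est):
    return (p - est) ** 2


def weighted_loss(p, est):
    """Squared error divided by p(1 - p)."""
    return (p - est) ** 2 / (p * (1 - p))


def risk(estimator, n, p, loss):
    """Expected loss of the estimator when the true chance is p."""
    return sum(binom_pmf(r, n, p) * loss(p, estimator(r, n)) for r in range(n + 1))


n = 3
grid = [0.1, 0.3, 0.5, 0.7, 0.9]  # arbitrary illustrative values of p

# Each estimate has the same risk at every p under its own loss function.
risk_minimax = [risk(minimax_estimate, n, p, squared_loss) for p in grid]
risk_familiar = [risk(familiar_estimate, n, p, weighted_loss) for p in grid]

# Two heads and one tail, as in the coin example.
est_familiar = familiar_estimate(2, n)
est_minimax = minimax_estimate(2, n)
```

With two heads in three spins the familiar estimate is 2/3 and the minimax estimate is roughly 0.61; both lie well away from 1/2, which is the reviewer's point about small samples from a freshly minted coin.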

How relevant is decision theory to statistics? Two things seem clear. In the first 
place judgments of utilities are more precise in some industrial and military applica- 
tions of statistics than in applications to pure science. (By “precise” we mean judged 
to lie in a narrow interval.) In purely scientific applications the judgment of initial 
probabilities of hypotheses will often be more precise than their utilities. For such
applications we could define “admissible strategies of the second kind” as strategies 
for which there is no strategy having more expected utility, when the initial proba- 
bilities are assigned and the utilities are allowed to range over sets (instead of the 
other way round). Maybe admissible strategies of the third kind should also be con- 
sidered, with the initial probabilities and the utilities both ranging over sets. 

The heart of statistics, for ordinary scientific applications, is still to be found in 
such books as Yule and Kendall’s Introduction to the Theory of Statistics (London: 
Charles Griffin, 1950) and The Design and Analysis of Industrial Experiments, edited 
by O. L. Davies (London: Oliver and Boyd, 1954). If, at random, we sample one 
index entry from the latter book and one from the book under review, the chance 
that they will coincide is about 1/70,000. For the statistician working in pure science, 
decision theory is a part of the philosophy or foundations of statistics, just as formal 
logic is a part of the philosophy or foundations of mathematics. Decision theory is 
not a subject that can be appreciated in all its austere details by a statistician with 
less than one or two years of experience of real life. A little goes a long way, like 
philosophy for the mathematician, mathematics for the practical physicist, physics 
for the engineer, or engineering for the businessman. The best philosophers are often 
mathematicians, but a little philosophy can be of more practical importance than a 
lot of mathematics. 




Handbook of Probability and Statistics with Tables. R. S. Burington and D. C. May. 
Sandusky, Ohio: Handbook Publishers, Inc., 1953. Pp. ix, 332. $4.50.


D. Teichroew, The National Cash Register Company


The authors state that, “[This book] brings together information which is not
otherwise readily available in simple form except by reference to numerous jour- 
nals, tables and treatises on the subject.” This statement is a slight exaggeration, 
since a combination of one or two of the standard textbooks and a book of tables 
(such as Fisher and Yates, or Hald) would contain most of the material; however, 
anyone who wants to make a minimum investment in a library on statistics and 
probability will find this handbook useful for practical problems. 

The book contains 240 pages on statistical theory, 63 pages of tables, 2 pages of 
references, 1 page of index of names, 3 pages for an index of symbols (giving the 
pages on which they are used) and 14 pages of an index of subjects. The index is 
remarkably complete and materially improves the usefulness of the handbook. 
The subjects covered, or mentioned, in this book are: an introduction to statistical
terms, moments and other measures, one-dimensional frequency distributions, 
combinations and permutations, elementary probability theory, one-dimensional 
probability distributions, generating and characteristic functions, binomial, normal, 
and Poisson distributions, multi-dimensional probability distributions, regression 
theory and time series, sampling distributions, statistical inference, significance tests 
and confidence intervals, analysis of variance, sequential analysis, sampling inspec- 
tion, quality control, and finite differences and interpolation. This appears to be a 
reasonable choice of subjects and should be sufficient for most occasional users of 
statistics. The grouping of information is sometimes hard to understand, e.g., the 
Beta and Gamma distributions are discussed under Sampling Distributions rather 
than in the chapter on Probability Distributions; however, the index appears to be 
sufficient to locate such material. 

The section on tables contains values for the following distributions: binomial, 
incomplete Beta, Poisson, normal, F, z, t, $\chi^2$; miscellaneous tables of Stirling’s numbers,
binomial coefficients, $\sqrt{npq}$ and $\sqrt{pq}$, $e^{-x}$, factorials and the Gamma function and
their logarithms, reciprocals, squares, square roots and their reciprocals, trigono-
metric functions, natural and common logarithms; a table of integrals; and a nomo-
gram for $1 - (1 - p)^n$. In addition, there are 15 tables in the text. Of these, 3 are con-
cerned with the normal distribution, 4 with the bivariate normal, and 2 give the con-
fidence interval for proportion of successes for coefficients .95 and .99. The following
subjects are each covered by one table: the trivariate normal, the relation of
sample standard deviation to the population standard deviation, the r-distribution, 
the distribution of the range, the probability that a fraction A of a population lies 
within the range of the sample, and numerical coefficients for estimating central lines 
and control limits. 

In the preface, the authors state that their objective was to provide a handbook 
that readers, without detailed statistical knowledge, could use as a guide and those 
with statistical training could use as a convenient summary. The first part of the 
objective has been achieved by including more introductory material than is usually 
in handbooks. The major limitation, as far as the second part of the objective is 
concerned, lies in the fact that, except in a few cases, the reader is given no help in 
locating further information on a subject. Undoubtedly, additional references were 
omitted because of space limitations. However, by giving references with a code using 
a numbering system for the books listed on the two pages of references and for a dozen 
or so of the statistical journals, the value of the book could have been improved im- 
measurably with very little increase in space. 


Operations Research for Management. J. F. McCloskey and F. N. Trefethen, Editors. Bal- 
timore: The Johns Hopkins Press, 1954. Pp. xxiv, 409. $7.50. 


Samuel Karlin, California Institute of Technology

The science of operation research is basically concerned with the task of analyzing
various courses of action as a means of arriving at optimal decisions.

The decision problem must be clearly defined. More explicitly the objectives of 
the various participants, the alternatives of action, their consequences, and the na- 
ture of all the random effects and disturbances must be ascertained. From here on the 
theory attempts to examine and evaluate the various courses of action in terms of the 
objectives. The areas of applications are vast and include economics, military logistics, 
business, and industrial engineering. 

This book can be viewed as an introductory survey of the methodology and philoso- 
phy of decision making as applied to management science. This work is a collection
of articles written by experts in various areas of operation research. 

The presentation is divided into three main parts. Part one highlights some of the 
key developments in the history of the theory. A discussion of the relationship of the 
operation researcher and management is given. Also, in this part an attempt is 
made to define the scope of the field of operation research. However, the real justifi-
cation of the theory is found in the content of parts two and three of this volume.
Part two is concerned with the methodology of the theory. Representative topics 
treated include surveys of queueing theory, information theory, linear programming, 
game theory, and a chapter devoted to computing machine methods. The chapters on
queueing theory and game theory must be particularly commended since the power
of more advanced mathematical ideas and techniques is described as applied to traffic-
flow problems and military tactical games respectively. On the other hand the use
of statistical ideas in decision making is presented in its simplest form. Many of the
advancements of statistical decision theory are not even mentioned. Specifically,
some noteworthy omissions of methodology are, for example, (1) stand-
ard statistical procedures of testing hypotheses and estimating unknown parameters,
(2) dynamic or sequential methods related to multistage decision procedures.

Part three of this volume is devoted to a study of a number of case histories. 
Included are case studies of operations research as it applies to the printing industry, 
sales promotion, heavy industry, and agriculture. 

As the uses of and needs for mathematical methods in the social and management
sciences increase, it is of great value to attract young scientists to deal with the inter-
esting and challenging problems that emerge in these domains. This introduction to
the new science serves such a purpose well.


Government Statistics for Business Use. Second Ed. Philip M. Hauser and William R. 
Leonard, Editors. New York: John Wiley and Sons, Inc., 1956. Pp. xvii, 440. $8.50. 


This is a second edition of the volume reviewed in this Journal by William A. Spurr,
Vol. 41 (1946), pp. 610-2. 

The present edition, according to the Preface, “brings up to date a description of 
the federal statistical system, taking into account the changes that have occurred 
since 1946. Thirteen of the fourteen chapters in the first edition have been revised. 
One of the chapters in the first edition—‘Accounting Statistics’—has been dropped, 
and two new ones have been added—Chapter 14, ‘International Statistics,’ and 
Chapter 15, ‘Some Uses of Sampling and Sampling Aids.’ In addition to describing 
the more important types of statistics currently available throughout the govern- 
ment, we also include descriptions of the basic data available in the more recent 
censuses and of the kinds of data that may be expected from the 1955 economic 
censuses, including the quinquennial Census of Agriculture. As this volume goes to 
press, a few decisions concerning the economic censuses have yet to be taken although 
it is unlikely that significant modifications of the plans described in this book will 
occur. Since these economic censuses will not be repeated for at least four or five
years and the current statistical programs of most of the federal agencies have under- 
gone their major postwar changes, this revision should remain a useful guide to 
government statistics for some years to come.” 

The fifteen chapters and their authors are: (1) Introduction, Hauser and Leonard; 
(2) National Income and Other Business Indicators, Milton Gilbert; (3) Manufac- 
turing, Maxwell R. Conklin; (4) Mineral Statistics, Y. S. Leong; (5) Agriculture,
Conrad Taeuber and J. Richard Grant; (6) Retail, Service, and Wholesale Trades, 
Howard Grieves, Harvey Kailin, and Rexford C. Parmelee; (7) Foreign Trade Statis-
tics, J. Edward Ely; (8) Transportation and Other Public Utilities, Frank L. Barton;
(9) Money, Credit, and Finance, Edward T. Crowder; (10) Prices, Lester S. Kellogg;
(11) Housing and Construction, Paul F. Krueger; (12) Population, Hauser; (13) 
Labor, Charles D. Stewart; (14) International Statistics, P. J. Loftus; (15) Some 
Uses of Sampling and Sampling Aids, Joseph Steinberg and Morris H. Hansen. 
W.A.W. 


Measuring Business Changes. Richard M. Snyder. New York: John Wiley & Sons, Inc., 
1955. Pp. xvii, 382. $7.95. 


Maurice W. Lee, State College of Washington


The fact that a volume such as this one has seemed necessary is in itself perhaps an
encouraging commentary. The outpouring of quantitative information has 
threatened the engulfment of the users of statistical series. Even the expert must at 
times feel adrift in a maze of data. For the less-than-expert user of such information 
the course is indeed a confusing one. 

This volume follows somewhat in the pattern of Arthur H. Cole’s Measures of 
Business Change, a conclusion with which Mr. Snyder would presumably agree since 
he comments concerning the Cole book, saying: “Our purpose in this volume is to 
provide a more detailed description of the nature and composition, as well as the uses 
and limitations, of a selected list of the more important business indicators.” This 
the author does in satisfactory fashion. Snyder’s book should prove a useful tool. 

There seems little to criticize in the organization of this book. After a short intro- 
ductory chapter the author moves directly into his discussion of various statistical 
series. This review may provide some service to its readers if the general headings 
of these main sections are noted. They are, with their lengths in pages: 1, National 
Income and Product Series (28); 2, Population (9); 3, Labor: Employment and Earn- 
ings (22); 4, Commodity Prices (71); 5, Production and Business Activity (50); 6, 
Construction Activity and Costs (37); 7, Trade (28); 8, Financial Activity (40); 9, 
Stock Prices (30). 

For the most part the author’s treatment of the various statistical series which he 
presents cannot be regarded as analytical. It is essentially descriptive. Sources of 
current and past data are suggested and a fairly comprehensive index should prove 
helpful to the investigator in search of a particular series. 

This reviewer found it difficult to decide for whom this book was written. The 
author has indicated that it was prepared for businessmen. Undoubtedly a good many 
businessmen could profit from reading this book; most will have little inclination
to do so. It is to be hoped that they will authorize the expenditure of the necessary
sum to place this volume on the desks of their staff assistants who have had some
training in statistical-economic analysis. It may well find a place on the — 
of such people. 

This reviewer would like to express one small carping comment. The title of this 
book Measuring Business Changes is somewhat misleading. It conveys a concept of 
analytical treatment which the book was not designed to accomplish. Perhaps the 
copyright on Cole’s book, Measures of Business Change, estopped the use of a similar 
title for this volume. It would have been more appropriate. 

Any book of this sort must suffer from the inherent nature of its subject matter. 
Series are constantly being revised, withdrawn, or initiated, and a book of this sort 
must always be revised or lag behind the fact. There are already some series described 
by Snyder which have undergone such change. But, on balance, the book must be
regarded as a useful “Baedeker” for the analyst. 








394 AMERICAN STATISTICAL ASSOCIATION JOURNAL, JUNE 1956 


Long-Range Economic Projection: Studies in Income and Wealth, Volume 16. The Con-
ference on Research in Income and Wealth. Princeton: Princeton University Press, 1954. 
Pp. x, 476. $9.00. 


R. A. Gordon, University of California, Berkeley


As Richard Ruggles notes in his introduction, the “contributors to this volume have
concerned themselves mainly with conceptual problems of long-term projec- 
tions....” Only two of the twelve papers attempt actually to project important 
economic variables into the future. Most of the papers are concerned with exploring 
what we need to know and how our analytical techniques need to be improved before 
confidence can be placed in even highly conditional long-term forecasts. 

These papers all deal with some aspect of the problem of making long-term projec- 
tions of the national product. Part I begins with a thoughtful and cautious essay
by Simon Kuznets on “Concepts and Assumptions.” The skeptical tone set in this 
essay is echoed in some of the other papers. The other two contributions in Part I 
deal with the supply side of GNP projections. Harold Wool attempts to project the 
labor force to 1975 by familiar methods. J. W. Kendrick reviews past trends in pro- 
ductivity and considers the problems that must be faced in projecting these trends 
into the future. 

Part II is concerned with specific industry projections. J. P. Cavin briefly reviews 
the agricultural projections made by the Bureau of Agricultural Economics, and 
this is followed by Rex F. Daly’s ambitious attempt to appraise the long-run prospects 
for agriculture. Statisticians will be particularly interested in Harold Barnett’s test 
of the usefulness of the input-output matrix in projecting the outputs of individual 
industries. In general, the familiar multiple-regression technique, utilizing the past 
relation of an industry’s output to GNP and time, gave somewhat better results 
than did the input-output technique, and two other naive models were not markedly 
inferior to the input-output method. The comments by Marshall and Lebergott are 
useful additions to Barnett’s paper. Part II concludes with an attempt by Paul 
Boschan to measure the demand for steel. 

Part III takes up the components of aggregate demand, and it is here that we find 
the greatest unwillingness or inability to specify the precise relationships that can 
be safely projected into the future. Mrs. Smelker’s position is that we do not know 
enough about the determinants of consumption and saving to project these crucial 
variables. Arthur Smithies not only refuses to project government expenditures and 
revenues but deplores “the amount of time and effort that is now going into the 
statistical computations of projections which may have little more validity than at- 
tempts to guess the height of the emperor of China.” Polak is less pessimistic but 
confines himself to consideration of the kind of model that might be used to project 
imports and exports. Fellner’s consideration of long-term tendencies in capital forma- 
tion will be of greater interest to economists than to statisticians. He confines himself 
to the trend in the over-all capital-output ratio, primarily as revealed by Kuznets’ 
data, and tries to reach a judgment as to (a) what was the “normal” value of this 
ratio at the end of the 1940’s and (b) whether this indicated value would have sug- 
gested inflationary or deflationary conditions for the United States in the 1950's. 

The final contribution by Isard and Freutel considers the possibility of regional 
product projections. The paper is an exploratory essay, entirely theoretical in char- 
acter, on the relations between regional and national product. Empirical work in this 
field, as the authors note, is still in its infancy, and the models they explore are in- 
tended primarily as starting points for further research. But this, it would seem, is not 
far from where we stand with respect to most of the problems of projection considered 
in this volume, at the national as well as at the regional level. Our ability to recognize 
and quantify historical regularities is still very limited, and our ability to project
these assumed regularities into a future which will vary from the past in unknown
ways is more limited still.


The Strategic Role of Inventories. Business Executives’ Research Group in cooperation with 
the Wharton School of Finance and Commerce. Philadelphia: University of Pennsylvania. 
46 pages. Paper. $1.00. 


Ruth P. Mack, National Bureau of Economic Research


This 46-page monograph is a joint product of officers of some 30 corporations and
four members of the faculty of the Wharton School of the University of Pennsyl- 
vania. The first six handsomely printed pages are devoted to “the role of inventories 
in the business cycle.” Here the ratio-to-sales argument is developed. However, when 
dealing with the problem of prediction, the failure of “sophisticated forecast” based 
on historical inventory-sales ratios is cited in favor of essaying a “new approach” to 
forecasting. 

The next six pages follow this new approach and present the results of a question- 
naire in which executives of 22 companies say whether they expect their inventories 
to rise or fall during the next three months and why; also (on the next questionnaire) 
they indicate whether inventories actually did rise or fall. The data refer to the first 
and second quarters of 1954 and show a predominance of expected and actual declines. 
The direction of change in stocks, company by company, was almost always correctly
anticipated. However, both a priori reasoning and the stated reasons for the change 
indicate that the apparent success in prediction could easily mean nothing more than 
that business men understand seasonal patterns of change. Aid in evaluating the
approach could have been gleaned from a discussion of how the forecasts were made.
But no such discussion is reported. Yet the authors conclude that “the results so
far... suggest that this approach may lead to a significant improvement over present
methods of forecasting inventory changes. . . .”

I find little evidence in support of this conclusion in the printed document. There 
is the seasonal problem previously mentioned. There is the further problem involved
in the word “significant.” If some other less costly method would provide greater 
improvement, the approach via expectations can be by-passed; for, in connection 
with inventories, expectations (unlike the contract-tied expectations about durable 
equipment) have little economic interest in themselves. I suspect that correct expecta-
tions are ordinarily well-founded expectations, and a sound basis for founding expecta-
tions about stocks is the volume of orders on the books. If so, the collection of statis-
tics on orders would provide a cheaper and more “significant” improvement in pre-
dicting change in stocks than expectations. In any event, the reader feels that a 
fine opportunity to improve his understanding of this lively subject has somehow 
slipped through his fingers. 

The rest of the monograph discusses the control of inventories: “the inventory
problem from the standpoint of the firm.” It draws on academic analysis of inventory 
control and on the answers to two questionnaires submitted to the business group. The 
first questionnaire bears the title “role of inventories” and is so referred to in the 
text. It consists of fourteen questions, eleven of which relate to inventory budgets 
and their realization. The second questionnaire is on the control of inventories. It 
asks many detailed questions, answers to which must differ for various sorts of stocks 
(supplies, major purchased materials, finished goods) and the control system appropri-
ate to each; unfortunately the questionnaire does not provide for specification as 
to the sorts of stocks to which answers apply. In the course of the general discussion, 
the inventory control problem is never really placed in its proper relation to other
management problems such as increasing sales or buying advantageously, though 
there are passing bows to such matters. The usual lists of costs are considered and 
note is taken of the fact that estimates of the carrying cost vary from 6 to 25 per 
cent of the value of stocks. The range seems wide, but speculation about its cause fails 
to indicate how the recommended further study would produce substantially better 
estimates. 

The difficulty seems to lie in the fact that business opinion was not taken seriously 
enough either to accept it at face value or to dig for its true meaning and implication. 
We learn that quantity discounts were hardly mentioned; that, except for one retail 
firm, none assigned great importance to obsolescence or storage cost. Half of the 
firms reported that they were influenced by anticipated price movements while typi- 
cally assigning it a minor role and declaring that this was not “speculation.” State- 
ments of this sort and the interpretation placed upon them bespeak inadequate com- 
munication between the questioned and the questioner. 

This is a disappointment. Certainly these seminars were conceived in the convic- 
tion that businessmen’s thinking is the stuff of much fruitful economic analysis. How- 
ever, the results suggest that this thinking is not easily utilized, as it must be, to 
frame the questions as well as to shape answers to so subtle a problem as the role of 
inventories in the firm or in business fluctuations. Perhaps this is simply an illustra- 
tion of the recurrent dilemma that it is only at the end of a study that one feels compe- 
tent to begin it; certainly the group should be commended for its interesting start.


Transport and the State of Trade in Britain. Thor Hultgren assisted by William I. Green- 
wald. New York: National Bureau of Economic Research Inc., 1953. Occasional Papers 
No. 40. Pp. xii, 114, 9”. $1.50. 


K. F. Grover, Epsom, England 


In this book, Hultgren examines certain fluctuations in British transport activity,
following a line of approach similar to that which he employed in his American 
Transportation in Prosperity and Depression, and after tracing the impact of changes 
in the state of trade on various aspects of traffic and management makes some com- 
parisons of British and American experience. 

The first two chapters deal with freight and passenger traffic, the main conclusions 
being that the former fluctuates more than the latter but that the variations in both 
were comparatively moderate. The third, fourth, and fifth chapters discuss fluctua- 
tions in the use of rolling stock, alterations in maintenance policy, and variations in 
the productivity of labor and fuel inputs respectively. In upswings, it is disclosed, 
rolling stock is used more intensively, and maintenance policy, although subject to
lags, tends to be more vigorous, while better train loadings improve the amount of
effective work done per man employed and per ton of coal used. Chap. 6 shows that 
the financial results improve in a boom, too. Indeed, it cannot be said that many of 
the conclusions arrived at in this part of the book are very novel. 

The seventh chapter has something to say about traffic and operation since 1938, 
and Chap. 8 compares British and U.S. experience in certain respects. The last chap- 
ter is entitled “Towards understanding cycles” and after commenting that the pur- 
pose of the book has been mainly descriptive, indicates the relevance for a few broad 
types of trade-cycle theory of the sequence of events in transport during fluctuations 
in the national income. 

The main technique used by the author is that of relating the peaks and troughs 
of certain statistical series (tons carried, passenger journeys, etc.) to a system of 
“reference cycles” or “cycles in business at large” as set out in the chronology associ-
ated with Wesley C. Mitchell and Arthur F. Burns. Little is said in this book about 
the manner in which these “reference cycles” are arrived at so that it may not be 
easy for the reader to be sure of their relevance. 

The author makes extensive use of railway operating statistics in the course of the 
analysis of railway reaction and contrives to wring a considerable amount of infor- 
mation from them. It is clear, however, that statistics of this kind can give misleading
impressions if interpreted on too high a plane of abstraction. This is brought out on 
page 41 where the author points out that over a certain period the average car load 
for freight traffic as a whole fell, but that the average car load for each class of 
freight traffic considered separately actually rose. The explanation, of course, is 
that there was an increase in the proportionate importance of types of traffic giving 
poor car loadings. It would not be safe to assume that an improvement in the average 
car load for freight traffic as a whole or for any broad group of traffic necessarily 
indicates improved car loadings of any particular traffic. This neatly illustrates one 
of the pitfalls in the interpretation of weighted averages, of which many railway 
operating statistics consist. The author steers clear of the worst of these dangers.
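
The car-load effect described above can be reproduced with a small numerical sketch. The figures below are invented for illustration and are not Hultgren's data; the class names and the helper functions are likewise assumptions. The average load rises in each class between the two periods, yet the overall average falls because traffic shifts toward the class with poor car loadings.

```python
# Hypothetical figures: (tons carried, number of car loads) per traffic class.
period_1 = {"coal": (9000, 900), "merchandise": (1000, 500)}
period_2 = {"coal": (5500, 500), "merchandise": (2500, 1000)}


def average_load(tons, cars):
    """Average tons per car for one class of traffic."""
    return tons / cars


def overall_average(period):
    """Average tons per car for all traffic: total tons over total cars."""
    total_tons = sum(tons for tons, _ in period.values())
    total_cars = sum(cars for _, cars in period.values())
    return total_tons / total_cars


# Change in average load within each class (positive for both classes).
class_change = {
    name: average_load(*period_2[name]) - average_load(*period_1[name])
    for name in period_1
}

# Change in the overall average (negative, despite the class-level gains).
overall_change = overall_average(period_2) - overall_average(period_1)
```

The overall figure is a weighted average whose weights (the share of cars in each class) change between periods, which is why it can move against every one of its components.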

He makes a most courageous attempt in Chap. 1 to deal with one of the main prob- 
lems of transport statistics, namely the shortage of information about freight trans- 
port by truck. No figures of the quantities carried by vans and lorries in Great Britain 
are available for the period covered by this book, so that in fact it is very difficult to 
study variations in the volume of transport as a whole, but the author approached the 
problem of measuring variations in the railway’s share of total traffic from another 
direction. Drawing on a wide variety of sources he has constructed for over a dozen 
commodities measures of the total flows coming forward for transport over the years 
1928 to 1938. He has then used the published railway statistics to show what propor- 
tions of the total flow of these commodities traveled by rail. He concludes that the 
diversion of traffic was continuous but went on more quickly in periods of declining 
business activity than in booms. 

These, however, are difficulties of statistical material and sources: the main prob- 
lem raised by the book in the mind of the reviewer is one of language. The word 
“cycle” is frequently used, but it is not at all easy to see just what it means. In some
connections the author appears to have the notion of a trade cycle in mind in the sense
of a fluctuation in the national income resulting from business influence as opposed
to governmental action. It is not easy to reconcile the “trade cycle” connotations of 
the word “cycle” with its use in such phrases as the “war cycle.” In fact, the author 
appears to use the word in the sense of almost any fluctuation in transport activity. 

He shows himself aware of these “problems of definition” in his final chapter, saying 
that it is part of the task of description to note differences between the features of 
the “cycles.” But it is difficult to isolate enough instances of railway policy being 
affected solely by fluctuations in effective demand during the period 1918 to 1938, 
on which the author has to rely for many of his analyses. The first part of this period 
was influenced by the amalgamations and the second part by the emergence of a 
rival form of transport, so that even the fluctuations which at first appear to be 
purely of the trade cycle type may in fact have been affected in amplitude or dura- 
tion by noncyclical factors. 

Much of the book deals with what might be termed the “secondary stage” of rail- 
way reaction to changes in the economic environment. The first stage, the fluctua- 
tions in traffic, might be regarded as outside the control of the management, but this 
is not the case with changes in policy, particularly investment policy. These changes 
in policy do not follow automatically from changes in traffic but are the result of de- 
cisions taken by the managers in the light of their knowledge and opinions of prob- 
able future happenings. Past history may tell us something of the outlook of man- 
agement in the past but it by no means follows that similar changes in policy would 
result from comparable changes of traffic at any future date. The managers might 
take a different view another time. 

Hence, partly because of the wide variety of fluctuations considered in this book 
and partly because there is so much room for different managerial reactions in dif- 
ferent circumstances, one cannot be confident that the pattern discovered by the 
author would be a very good guide to the future. The work must be judged mainly as 
an economic history and as an exercise in statistical analysis. The industry and skill 
with which the all too few available statistical sources have been pressed into service, 
the care with which the analyses have been carried out, and the brevity with which 
the findings have been expressed, make this an interesting and helpful contribution. 


The Analysis of Family Budgets. S. J. Prais and H.S. Houthakker. University of Cam- 
bridge, Department of Applied Economics, Monograph 4. New York: Cambridge Univer- 
sity Press, 1955. Pp. xx, 372. $9.00. 


Robert M. Solow, Massachusetts Institute of Technology 


To all of us some of the time and to some of us all of the time it seems that eco- 
nomics fails to make progress as other sciences do. A modest antidote to that 
feeling would be a comparison of this book with Allen and Bowley’s pioneer classic 
of 1935, Family Expenditure. There are no breath-taking new discoveries, to be sure, 
but plenty of solid progress. Taking one thing with another, we know more now about 
the relation of family expenditure to income and household composition than we did 
then. 

What are the sources of this progress? The pure theory of consumer behavior has 
not changed basically since that time; there have been some advances, but they do 
not impinge particularly on this kind of study. There have been important new de- 
velopments in statistical theory, and they do matter here, even though the basic 
statistical tool is straightforward least squares. The basic data are no doubt better 
these days, but Prais and Houthakker deal with two British surveys dating from 
1937-39. In part their enterprise rests on twenty more years of general sophistication, 
of cumulated research experience and insights in this and related fields: Allen and 
Bowley had no Allen and Bowley. In part it rests on a relatively lavish supply of 
computing facilities, including both punch-card equipment and a high-speed digital 
computer. The big advantage here is the possibility of playing around, of trying out 
five different functional forms for Engel curves, of solving non-linear estimation equa- 
tions by iteration, etc. Finally, one must admit that the analysis of family budgets is 
one of the more well-behaved parts of empirical economics: the problems are well- 
defined, amenable to standard statistical procedures, not dependent on that econo- 
metrician’s nightmare, the short, not-too-stationary time series. 

Part I contains background material: a chapter on the theory of consumer demand, 
two on the way in which the data were collected and their limitations, a chapter on 
the estimation procedure to be used later (with some useful remarks on the theory 
of grouping commodities), and finally one containing household hints on the organi- 
zation of research projects including lots of data and computation. 

Part III, which occupies almost 200 pages, reproduces the basic tables from the 
two budget studies on which Part II is based. This material (one of the surveys 
covered working-class families, the other covered certain groups of civil servants) 
is not elsewhere available; the authors have done an expensive but noble act in offering 
the data for further research. As they remark, the period from 1939 until very recently 
was characterized by extensive rationing and other direct controls. Thus from the 
point of view of behavior in a free market, these surveys are practically “recent.” 
There is some evidence from England that with the end of controls, many pre-war 
demand relationships have begun to re-establish themselves. 

Part II is the heart of the book. Its first chapter is devoted to the Engel curve. 
In order to concentrate on income elasticities, the authors eliminate the cruder effects 
of family size by setting up the provisional working hypothesis that expenditure per 
person on any commodity depends only on income (more exactly, total expenditure) 
per person. This is certainly close enough to the truth so that they can proceed to 
explore alternative functional forms for the Engel curves. They try five, sticking to 
functions which are linear in the parameters. The linear Engel curves used by Allen 
and Bowley fall by the wayside on almost any criterion. Prais and Houthakker settle 
by and large on a semilogarithmic form (expenditure per person a linear function of 
the logarithm of income per person) for foods and a double-logarithmic (constant 
elasticity) form for most nonfoods. (Among nonfoods is a group of expenditures on 
tobacco, liquor, and amusements, which the authors have chosen to describe under 
the general heading “Vice”!) 
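
As a rough illustration of the two functional forms just described, the sketch below fits a semilogarithmic and a double-logarithmic Engel curve by ordinary least squares and reads off the expenditure elasticity from each. The function names, the use of NumPy, and the made-up figures are assumptions for illustration only; this is not the authors' own computing procedure.

```python
import numpy as np

def fit_line(x, y):
    """Least-squares fit of y = a + b*x; returns (a, b)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    design = np.column_stack([np.ones_like(x), x])
    coef = np.linalg.lstsq(design, y, rcond=None)[0]
    return coef[0], coef[1]

def semilog_elasticity(income, spend):
    """Semilog form: spend = a + b*log(income).
    The elasticity is b / spend, evaluated here at the fitted
    expenditure corresponding to mean income."""
    income = np.asarray(income, dtype=float)
    a, b = fit_line(np.log(income), spend)
    fitted_at_mean = a + b * np.log(income.mean())
    return b / fitted_at_mean

def double_log_elasticity(income, spend):
    """Double-log form: log(spend) = a + b*log(income).
    The slope b is the (constant) elasticity."""
    _, b = fit_line(np.log(income), np.log(spend))
    return b

# Hypothetical per-person figures, invented purely for the example.
income = np.array([50.0, 80.0, 120.0, 200.0, 400.0])
nonfood = 2.0 * income ** 0.7          # exactly constant-elasticity
food = 1.0 + 3.0 * np.log(income)      # exactly semilogarithmic
print(double_log_elasticity(income, nonfood))
print(semilog_elasticity(income, food))
```

Under the semilog form the elasticity declines as income rises, which is one reason it may suit foods, while the double-log form holds the elasticity fixed at every income level.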

Some of the possible functions were chosen to allow for the existence of saturation 
phenomena. That the best choices have no saturation levels may only indicate that 
neither manual workers nor civil servants were anywhere near saturation in 1939. 
Expenditure elasticities at mean income are given for some 116 different commodities, 
and separately for the working-class and middle-class groups. 

One matter of statistical technique is worth describing here. The authors are con- 
cerned with testing regressions of the form y = a + b f(z) for linearity. The trouble 
with the usual analysis of variance procedure is that in cases like this one usually 
has a special class of alternatives in mind, namely those in which the departure from 
linearity is “smooth.” This will have the effect that residuals at neighboring values 
of f(z) will tend to have the same sign, and tend to be “serially” correlated. Thus 
Prais and Houthakker are led to base a test for linearity on the serial correlation or 
the related von Neumann ratio of the residuals, or even more simply on a statistic 
like the total number of runs of positive and negative residuals. Tests like this should 
have good power against the kinds of alternatives that matter. They also have the 
advantage that there is no need to have several observations at each value of the 
independent variable. 
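
The idea can be sketched in a few lines: fit the straight line, order the residuals by the value of the regressor, and then look at either the number of sign runs or the ratio of the mean squared successive difference to the variance. The code below is only an illustration under stated assumptions. The function names are invented, NumPy is assumed, and the normalization of the von Neumann ratio (divisor n - 1 in the numerator, n in the denominator) is one common convention rather than necessarily the one used in the book. No significance levels are computed.

```python
import numpy as np

def ordered_residuals(fz, y):
    """Residuals from the least-squares line y = a + b*f(z),
    arranged in increasing order of f(z)."""
    fz = np.asarray(fz, dtype=float)
    y = np.asarray(y, dtype=float)
    order = np.argsort(fz)
    fz, y = fz[order], y[order]
    design = np.column_stack([np.ones_like(fz), fz])
    coef = np.linalg.lstsq(design, y, rcond=None)[0]
    return y - design @ coef

def count_runs(resid):
    """Total number of runs of positive and negative residuals
    (exact zeros are dropped)."""
    signs = np.sign(np.asarray(resid, dtype=float))
    signs = signs[signs != 0]
    if signs.size == 0:
        return 0
    return 1 + int(np.count_nonzero(signs[1:] != signs[:-1]))

def von_neumann_ratio(resid):
    """Mean squared successive difference divided by the variance.
    Values near 2 suggest no serial correlation; values well
    below 2 suggest smooth, same-signed neighboring residuals."""
    resid = np.asarray(resid, dtype=float)
    n = resid.size
    msd = np.sum(np.diff(resid) ** 2) / (n - 1)
    var = np.sum((resid - resid.mean()) ** 2) / n
    return msd / var

# A smooth departure from linearity: a parabola fitted by a line.
z = np.arange(21.0)
resid = ordered_residuals(z, z ** 2)
print(count_runs(resid), von_neumann_ratio(resid))
```

With a smooth curved relation the residuals fall into very few runs and the ratio drops far below 2, which is exactly the pattern such a test is meant to detect.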

The next chapter goes on to consider quality variations in the consumption pattern. 
For those commodities where a quantity figure is given along with expenditure, it 
is found that average price paid tends to vary positively with income and negatively 
with household size. This is interpreted as reflecting quality differences. The authors 
adopt an interesting model in which average price paid is made a linear function of 
the logarithm of income per person, and get some excellent results. I find myself 
uncomfortable with the notion of measuring quality by price; it is hard to see how 
the theory of consumer demand could accommodate itself to this kind of formulation. 

There follows a clear and interesting pair of chapters on unit-consumer equivalences 
and the related problem of economies of scale in consumption. Prais and Houthakker 
adopt a formulation which distinguishes scales for each commodity and one for in- 
come. Their model of the household comes down finally to: 


$$ \frac{v_i}{\left(\sum_j k_{ij} n_j\right)^{\theta_i}} = f_i\left[\frac{v_0}{\left(\sum_j k_{0j} n_j\right)^{\theta_0}}\right] $$


where $v_i$ and $v_0$ are expenditure on the $i$th commodity and total expenditure respec- 
tively, $k_{ij}$ is the equivalence measure for a person of type $j$ on the scale for commodity 
$i$, $k_{0j}$ is the same thing on the income scale, $n_j$ is the number of persons of type $j$, and 
$(1-\theta_i)$ and $(1-\theta_0)$ measure specific and general economies of scale. The estimation 
of equivalence scales from budget data is a thoroughly non-linear problem. The 
authors make some sample computations by an iterative method with the added 
simplifying assumption that the income scale gives equal weight to all persons. 
Specific scales are estimated for six food groups and for all foods together. Economy- 
of-scale coefficients are estimated for the same food groups, and also for a few non- 
foods. 
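
To make the bookkeeping behind such a household model concrete, the sketch below computes a household's size in equivalent units (the sum over person types of scale value times head count) and deflates an expenditure figure by that size raised to a power theta. All names and all numerical scale values are hypothetical and chosen for illustration; the iterative estimation of the scales themselves is not attempted here.

```python
def equivalent_size(scale, counts):
    """Household size in equivalent units: sum of k_j * n_j,
    where k_j is the scale value and n_j the number of persons
    of type j."""
    return sum(scale[person_type] * n for person_type, n in counts.items())

def deflated(expenditure, scale, counts, theta):
    """Expenditure divided by (equivalent size) ** theta.
    theta = 1 means no economies of scale; theta < 1 means that
    needs grow less than proportionally with household size."""
    return expenditure / equivalent_size(scale, counts) ** theta

# Hypothetical scale values and household, not taken from the book.
food_scale = {"adult": 1.0, "child": 0.5}
household = {"adult": 2, "child": 2}

print(equivalent_size(food_scale, household))
print(deflated(90.0, food_scale, household, theta=1.0))
print(deflated(90.0, food_scale, household, theta=0.5))
```

Applying the same deflation with a commodity scale on one side and an income scale on the other gives the two quantities that the household relation connects.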

There is a brief chapter on social, regional, and occupational factors in consump- 
tion, and a modest few pages of conclusions. All in all, a most successful book. 

There is a minor slip in equation (815) on p. 53, so that (5.16) should 
read: r;~(2—d/2). Also, all the signs seem to be reversed in the table on p. 96. 


Student Spending at Indiana University, 1951-1952. Mary M. Crawford, Stanley Stein- 
kamp, and Edward L. Hauswald. Bulletin of the School of Education, Indiana University, 
Vol. 31, No. 6. Bloomington: Division of Research and Field Services, 1955. Pp. 82. $1.00. 


Shirley J. Heinze, University of Chicago 


This is the second major study of student spending at Indiana University. The 
first was conducted in 1940-1941 and was concerned primarily with patterns of 
expenditures among the students. The present study, conducted during the academic 
year 1951-1952, cites five objectives: (1) to determine how much was spent by the 
students during the academic year, (2) to see whether there are differences among the 
various student groups with respect to patterns of spending, (3) to compare spending 
in 1951-1952 with spending during 1940-1941, (4) to determine the sources of stu- 
dents’ finances, and (5) to work out a method for collecting and analyzing data relat- 
ing to student expenditures which might be used in making similar studies at other 
universities. 

The group studied was composed of 678 students selected at random from seven 
strata in the population. These strata were: (1) organized men, (2) organized women, 
(3) dormitory men, (4) dormitory women, (5) out-in-town men, (6) out-in-town 
women and (7) married men. The term “out-in-town” refers to single, independent 
men and women not living in University housing. The marital status of male students 
took precedence over all other classification criteria. The fraternity or sorority mem- 
bership of men and women took precedence over housing criteria in determining 
their classification. 

The data for the study were obtained by personal interviews conducted with the 
use of “fixed-alternative” questionnaires. Separate schedules were employed for 
single and for married students. The schedules for student spending appear in the 
appendix. However, questions regarding income do not appear there. It is not clear 
whether the wording of these was left to the individual interviewers or was fixed but 
not written on the schedule. The actual recording of the income information was 
handled by the respondents themselves. Cards were supplied for them to fill in and 
place in sealed envelopes. 

The study is reported in a straightforward manner with a minimum of interpreta- 
tion. The findings, in broad, tend to support one’s general impressions of student 
spending. Except for out-in-town men who spent more for University fees, textbooks 
and supplies, and for recreation than for clothing, food and clothing were the two 
items that headed the list of expenditures for single students. In every stratum men 
spent more for food than for any other item. Except for those who lived out in town, 
the women spent more money on clothing than on food. The married students spent 
more money for rent than did any other group, but their expenditure for clothing was 
less. A larger part of the married men’s total expenditure went into food than was 
true for the single men. The average total spending was noticeably lower for inde- 
pendent than for organized students. Money received from home was the major 
source of income for all single students. 

In comparing the results of this study with those of the 1940-1941 study, the inves- 
tigators found that, with the exception of rent, University fees, textbooks and sup- 
plies, and dues, the single students spent as much or more during the first semester 
of 1951-1952 than they did during the two semesters of 1940-1941. The reader is 
warned, however, that the data from the two years are not strictly comparable since 
personal interviews were not conducted in the earlier study. In general, the greatest 
proportional increases in total expenditures since 1940-1941 were for clothing, recrea- 
tion, transportation, and health. 

On the whole, the study appears to have been carefully planned and executed. The 
strata were well chosen. However, the value of the data would undoubtedly have 
been increased had those students living in town with relatives been treated sepa- 
rately. While only a small proportion of the total sample had such arrangements, 
women were predominant among those who did. Hence we would expect this factor 
to have a differential effect on the mean expenditures for male and female students. 
Further, this undoubtedly depresses the mean expenditures for the out-in-town group 
generally. 

This study constitutes an improvement over the earlier one in that income data 
were collected and personal interviews were conducted. Unfortunately, the method 
of collecting the income information tends to defeat the purpose of using technically 
trained interviewers in that the investigator is deprived of the information which 
skilled probing could provide. 

Obtaining accurate information regarding income and expenditures has concerned 
economists and survey methodologists for some time. In general the attempt to in- 
crease accuracy has been made through increasing the amount of detailed information 
requested. There is good reason for believing that more valid and reliable information 
can be obtained with respect to expenditures on specific items than is usually given 
in response to general questions about classes of expenditures. Further, to the extent 
that detailed questions are not asked, we may expect mean incomes or expenditures 
of various groups in the population to be differentially affected. Insofar as we are 
interested in comparing classes which cut across the groups thus affected, the conse- 
quences may not be too serious. However, when the differentially-affected groups, 
themselves, constitute the classification we wish to employ in comparisons, the conse- 
quences will be greatest. In a study such as the present one, this is a particular prob- 
lem. In this case, quite general questions about food expenditures may suffice for 
that portion of the sample which has institutional arrangements. These students 
may simply report the amount they are assessed for food each month, assuming they 
are given such a breakdown in their bills. On the other hand, those who eat out or 
prepare their own meals to any extent will give much less accurate figures. Whether 
errors may be expected to cancel each other is a moot point. The investigators 
were aware of this problem since they asked students who ate out to any extent how 
much they had spent for their last breakfast, lunch, and dinner. However, they also 
asked: “What is the average cost of your meals per week?”, and it is not clear which 
of these questions was used in the final computations. Students living in apartments 
and married students were asked: “What is your average grocery bill per week?” as 
well as what they had spent on meals eaten out during the week. The generality of 
these questions would lead us to expect a much greater response error for these groups 
than for those living in dormitories, fraternities and sororities. It may well have been 
that the generality of the questions was offset by skilled probing on the part of the 
technically trained interviewers used on the study. Unfortunately, the reader is not 
in a position to determine the degree to which this is the case. 

Despite problems of the types discussed here, the study was, on the whole, well 
planned and carried out. The statistics appear to have been competently handled. 
The report may very well be of interest and value to administrators wishing to con- 
duct similar studies in their own universities. 


Contributions of Survey Methods to Economics. Lawrence R. Klein, George Katona, John 
B. Lansing, and James N. Morgan. New York: Columbia University Press, 1954. Pp. viii, 


George H. Brown, Ford Motor Company 


This book is a collection of articles prepared specifically for joint publication by 
the four staff members of the Survey Research Center of the University of Michi- 
gan who are most concerned with the application of survey methods to the study of 
business cycles. Following a six-page introduction by Lawrence R. Klein, which 
summarizes the findings of the studies presented, there is a forty page discussion by 
John B. Lansing of the concepts and definitions used in the Surveys of Consumer 
Finances which provide the data used for analysis in the succeeding articles. The 
second article, contributed by George Katona, advances the notion that certain 
aspects of both consumer spending and savings are contractual or habitual, while 
other aspects are variable through time since they are outlays of choice. It 
suggests that consumer income expectations, past and expected price 
changes, and the consumer inventory situation are primary factors influencing what 
proportion of the outlays of choice is devoted to spending or savings. Data for the 
years 1947 and 1948 for 655 families are presented, showing that the direction of 
saving or dissaving remains constant for two-thirds of the families for the two cal- 
endar years but that for one-third of the families there was a shift from saving to dis- 
saving or vice versa (approximately equal proportions moving in each direction). 
This is followed by a brief presentation of data showing (a) variation through time 
(1951-1952) in consumer opinions on whether it was a good or bad time to buy 
durable goods and (b) a moderate relationship between the opinions of individual 
consumers on the general economic outlook and their opinion about buying condi- 
tions for durable goods. 

The next two articles in the book were written by James N. Morgan. Both articles 
deal with the application of analysis of variance to the residuals from the regression 
of savings (defined to include the purchase of durables) upon income. Most of the dis- 
cussion concerns problems of methodology, such as the procedures used in “homoge- 
nizing” the residuals, the elimination of unusual or abnormal cases from the data, 
and the exploration of alternative means of analyzing the data. It was discovered 
that the relationship between the level of liquid assets and the level of savings is 
anything but simple, either for a particular level of income or between income groups. 
The data also reveal a relationship between savings (as defined) and home ownership, 
size of city, and the position in the “life cycle” (i.e. young married versus old retired) 
of the spending unit, although here again the nature of the relationship is neither 
clear nor simple. 

The book closes with two articles by Lawrence R. Klein. The first and longer deals 
with the procedure and results of estimating multi-variate savings relations by mul- 
tiple regression techniques, using current year disposable income, prior year disposa- 
ble income, beginning of year liquid assets, number of persons in the spending unit, 
age of head of spending unit, and home ownership as independent variables. In con- 
trast to Morgan, Klein excludes the net outlays on consumer durables, other than 
housing, from his definition of savings. The regression technique is first applied to 
the 1948 data for the 655 families for whom both 1947 and 1948 income, savings, 
liquid assets, and expenditures on durables were available. The residuals from the 
multiple regression analysis were then studied in relation to expected income change, 
general economic outlook, and actual income change of the individual spending units. 
The results of a similar analysis for the data in the 1950 and 1951 Surveys of Con- 
sumer Finances are also reported, as well as a brief analysis using expenditures on 
durable goods as the dependent variable. The second article by Klein discusses the 
areas where additional analysis of survey data is needed in business cycle research. 

In spite of the prodigious effort that has gone into the collection and analysis of 
the data in the Survey of Consumer Finances, the progress as reported in the series 
of articles comprising this book is disappointing. Aided by the 20-20 vision of hind- 
sight, one can see that belief in the richness of the ores in the newly discovered 
source of data led the research group to hurry into the analysis without too much 
attention to statistical method or economic theory. For example, the discussion of 
statistical method is consistently in terms of processing and adjusting the data to 
make it conform more closely to the requirements of the methods of statistical analy- 
sis to be used. There is no discussion of analysis procedures that would fit the data 
to be analyzed, as for example the analysis of variance in the case of unequal numbers 
of observations in the sub-classes. Apparently little or nothing has been done with 
the problem of introducing into rigorous analysis those psychological measurements 
of attitude or intensity of feeling which are unlikely to have a uniform relationship 
to the dependent variable from one interval to another. As a result of this lack of 
attention to basic development of statistical method the statistician is likely to find 
little that is new or different in the methods described, other than one more case 
where ingenious devices have been developed to meet the requirements of present 
statistical theory. 

The same situation is true with respect to the contribution the collection of articles 
makes to a greater understanding of the process of the business cycle. The concept 
of savings as used in the studies not only changes from one analysis to another, but 
fails to segregate the savings activities into categories meaningful for the analysis of 
business cycles. It lumps together the increase in bonds, savings accounts, and 
checking accounts, the excess of purchases over sales of real estate, and repayment 
of debt. While all of these actions are savings in the sense that the consumer is 
acquiring the rights to services to be rendered at some future period of time, the im- 
pact of these several actions on the volume of goods produced is likely to be different 
both at any point in time as well as through time. The failure to consider depreciation 
as dissaving grossly overstates the “savings” of home owners and closes the door to an 
analysis of the influence of inflation on the income-spending complex. The difference 
in treatment between the purchase of a house and the purchase of other consumer 
durables makes it extremely difficult to understand and interpret the findings of the 
statistical analysis. There is, moreover, no effort to understand the forces leading to 
a net increase in the holding of liquid assets, land, buildings, securities, stock of 
durable goods, quality of durable goods stocked. Much of this must be charged to 
haste in finding some crude approximation to the concept of savings as used in the 
theory of business cycles in order to get on with the analysis of the data being ac- 
cumulated. As a result, the economist will find that the major contribution of survey 
methods to economics is to call attention again to the well-known deficiencies in 
present theories of the business cycle. 

The failure of traditional statistical methods and simple economic models to pro- 
duce great insights into the problems of the business cycle does not mean that such 
insights cannot be obtained through the analysis of survey data. It means, rather, 
that the new data call for new statistical methods and a more carefully developed 
statement of the relationships between various types of consumer expenditure pat- 
terns and the flow of goods and services. Both the introductory and concluding 
articles of this book indicate that these points are well known to the contributors 
to this volume. 


Industrial Censuses and Related Enquiries. Statistical Office of the United Nations. 
Studies in Methods, Series F, No. 4, Vols. I and II. New York: Columbia University Press, 
1955. $2.50 per volume. 


F. J. Rashley, Dominion Bureau of Statistics 


According to the foreword of Volume II, these two volumes were prepared as part 
of the response to the request of the United Nations General Assembly to prepare 
guides for the organization and collection of economic data in undeveloped countries 
[General Assembly resolution 407 (V)]. Volume I contains basic text, including a 
history of country programs (in many ways, the most fascinating and informative 
section), recommendations re scope, coverage, and concepts and a discussion of goals 
and techniques for less industrialized countries, with appropriate reference to the 
major industry and commodity classifications developed by the U.N. and suggested 
as the framework within which individualized classification schemes may best be 
developed. 

Since the volumes are, in fact, a digest of the experience of industry statisticians 
in many countries, modified by, added to, and seen through the integrating recom- 
mendations of the U.N., they are not susceptible to critical review on the basis of a 
particular reviewer’s biased response to some minor conceptual or procedural de- 
cision or instruction: rather, they are to be commended as very largely achieving 
their purpose, i.e., to assist countries that have not yet developed systems of indus- 
trial statistics, while providing, as well, a broad and unified statement for the use 
of industry statisticians in all countries. These latter, in the pursuit of the solutions 
of special problems, may need to be reminded of the desirability of comparability 
between series, a desirability not limited to the international scene, but remarkably 
pertinent to their own more circumscribed efforts. In the use of the resources avail- 
able, which are always limited in terms of the wished-for ends, they may be aided 
in a re-assessment of each part of their own effort by noting the surveys that are 
recommended as basic. The foregoing are examples only, and there are a number of 
other instances that might be given. 

The volumes no more than hint at three matters so basic that it might almost be 
said that there are no other real problems in industry statistics: (a) the definition 
and classification of the “establishment” (the accepted reporting unit), (b) the kind 
of values or value levels that are both meaningful and capable of collection, and (c) 
the statistics that defy the ordinary “establishment” basis for the collection of data 
and require the use of an “enterprise” concept. Leaving (c) to the volumes that ulti- 
mately may be written about it (for the idea of the “enterprise” as opposed to the es- 
tablishment is already generating a controversial future, far wider in its implica- 
tions than the precise problems of collecting certain pieces of data on one basis or 
another) the suggestion offered here is that (a) and (b), the definition and classifica- 
tion of the “establishment” and the kind of value data collected therefrom represent 
the main challenge to the industry statistician. He must apprehend the real facts 
of life in the individual units that go to make up the total life of the larger universe 
that he is attempting to investigate and to measure, he must saturate his concepts 
and all the rest of his mental and physical machinery with that apprehension and 
then, with a disciplined, imaginative effort, devise his own particular series. Follow- 
ing this, he commences to make the tens of thousands of individual decisions which 
alone bring the series into being. 


Methods of Collecting Current Agricultural Statistics. R. D. Narain. Rome: Food and 
Agriculture Organization of the United Nations, 1955. Pp. 300. $3.00. 


Walter A. Hendricks, Agricultural Marketing Service, U.S.D.A. 


FAO has here made a noteworthy start on a difficult undertaking—the issuance 
of “a publication which will bring together in comparable form the methods 
used in all countries in the world in the collection of agricultural statistics.” It is 
bound in loose-leaf form with the intention of being kept “up-to-date by incorporating 
fresh material as countries’ methods change and new information becomes available.” 
This reviewer was told that the price of the publication covers the cost of future 
supplements also. 

It would be out of the question to expect any work of this nature to give a detailed 
picture of the statistical activities of any one country, even when the field is re- 
stricted to current agricultural statistics. Altogether 89 countries in Europe, Asia, 
North and Central America, South America, Africa and Oceania are covered, in the 
hope “that the brief comparative study in this manual will help countries to view 
their own statistical systems in a broader perspective, and thus stimulate further 
thought and action towards improvement.” 

The description of the statistical system for each country is given in outline under 
the headings: 1. Administrative Division, 2. Agricultural Statistics Collected, 3. 
Collection of Data (Unit of enumeration, Type of enumeration, Method of enumera- 
tion), 4. Sampling Methods, 5. Organization for Collection and Tabulation (Field 
organization, Organization at head offices), 6. Time Schedule, 7. Processing of Data, 
and 8. Publications. 

An excellent Introduction reminds us of FAO’s activity in promoting improvement 
in current agricultural statistics and presents a statement of the case for the employ- 
ment of modern statistical techniques in that field with which no competent statisti- 
cian can justifiably disagree. 

The work gives the reader little more than a capsule of information on each country, 
but it is a pleasant surprise to discover how much essential material has been com- 
pressed into a few short sentences under each major heading. Anyone with an inter- 
est in this field cannot fail to be impressed by noting developments actually taking 
place today all over the world. The author modestly refrains from calling attention 
to what most of us already know—that is the great extent to which FAO itself has 
spear-headed the entire movement of establishing and improving these systems for the 
collection and publication of current agricultural statistics. 


A Statistical Study of Livestock Production and Marketing. Clifford Hildreth and F. G. 
Jarrett. Cowles Commission for Research in Economics Monograph Number 15. New
York: John Wiley and Sons, Inc., 1955. Pp. xiii, 156. $4.50. 

Ivan M. Lee, University of California, Berkeley


The statistical results reported in this monograph are estimates of coefficients in
five of a system of eight relations constructed to represent the generation of price,
production, and sales (at the farm level) of a rather broad livestock products
aggregate. Entering the basic model are a production relation; three farm decision
relations (demand for feed grain, demand for protein feed, and supply of livestock
products); demand for livestock products (viewed as deriving from consumer demand
and processor behavior relations); two feed supply relations (feed grains and protein
feeds); and an inventory relation. Attention is centered on estimates of the first five
relations.

The organization of content and style of exposition are excellent. A brief
development of the initial economic model in Chapter II is followed by a long chapter
indicating sources of data and describing the measures constructed to represent the
variables specified in the model. In the development of measures, particular attention 
is given to their conceptual suitability for the problem as conceived. Econometric 
specifications are added to the model in Chapter IV, and the initial estimates of rela- 
tions presented. In this version of the model, the variables, except time, enter in 
logarithmic form. The following three chapters are devoted, one each, to the
production relation, the farm decision relations, and the demand for livestock products
relation. Additional interpretation of initial results, statistical tests of serial inde- 
pendence and overidentifying restrictions, and a few modifications in the initial model, 
with supporting discussion, form the main content of these three chapters. Estimates 
of certain modified forms of the relations are included. A final chapter is given over 
to prediction tests. Estimates of the parameters being based on annual data for 
1920-1949, 1950 observations are used in these tests. Computed residuals from the 
five equations of the original model (linear in logarithms) as well as two of the modified 
relations estimated (demand for feed grains and demand for livestock products— 
linear in arithmetic values) are presented for comparison with 95 per cent acceptance 
intervals, based on least squares estimates, and computed residuals from two “naive” 
models.

As in other recent investigations employing modern econometric methods, the 
authors have followed the practice of presenting both limited information and 
direct least squares estimates of each of the relations estimated. And from a statisti- 
cal point of view, no real differences in the estimates from the two procedures can 
be cited. While some readers will no doubt regard the similarity of results as addi- 
tional evidence of the adequacy of direct least squares methods for the estimation of 
economic relations, the authors of the present monograph choose rather to point up 
the limitations of estimates from either formulation in this instance. Since, on the 
whole, the results from this study are as “good” and as “useful” as other similarly 
conceived earlier studies, the discussion of limitations may be regarded as having 
direct relevance to the earlier studies as well.

It is clear that the authors have not regarded substantive results as a primary 
objective of their study. Nor is their purpose primarily illustration of estimating 
techniques, since the techniques employed can no longer be regarded as novel. In 
the preface they state, “Useful empirical results were sought, but the main emphasis 
was placed on the development, application, and testing of methods that might 
prove effective in analyzing interrelated segments of economic activity.” The purpose 
is, therefore, methodological, not in the abstract, but with more specific reference to 
the substantive problem formulated. In this vein, the results presented serve as a 
point of departure for developing or clarifying points of method with, of course, the 
improvement of substantive content held in the background as the ultimate objective. 

It is perhaps unfortunate that limitations of research resources and limitations 
of data prevented casting the model in a little different mold. In particular, a less 
aggregative formulation (primarily in the product and geographic dimensions) might 
have proven rather interesting. Presumably, the methodological point of view could
have been incorporated as effectively in a less aggregative study and progress toward 
the ultimate attainment of useful results might have been advanced rather more than 
is apparent from the highly aggregative version reported here. In the absence of a less 
aggregative and more complex model as a basis for actual estimation, more explicit 
formulation in terms of less comprehensive aggregates supplemented by relations
connecting the less aggregative and more aggregative versions might have served more
effectively to advance our understanding of the relations for which estimates are pre- 
sented. 

The comments above with respect to formulation are not intended primarily as 
critical remarks but rather as reflecting concurrence in similar observations made by 
the authors themselves in their appraisal of results. Returning to the framework
within which the authors have chosen to conduct their investigation, this monograph 
reporting research conducted is regarded by this reviewer as very well done. The ex- 
cellent organization of ideas and style of exposition, the care with which the measures 
entering the analysis are reviewed and reconstructed for conceptual suitability, the
caution exercised in the interpretation of results, and the attention throughout to the 
methodological considerations underlying procedural decisions combine to give a 
refreshing flavor to the point of view from which the problem of estimating economic 
relations is approached. Quantitatively minded agricultural and general economists 
interested in bringing the methods of econometrics to bear on substantive problems 
should find a careful reading of this short monograph quite rewarding. 


Construction Review. Volume 1, No. 1, January, 1955. Washington, D. C.: U. S.
Government Printing Office. Pp. 52. Paper.


A new subscription monthly of the Departments of Commerce and Labor
appeared to replace Construction, published formerly by Labor, and Construction and
Building Materials, published by Commerce. It contains virtually all of the govern- 
ment’s current statistics that pertain to construction, plus information from some 
nongovernmental sources. There will also be articles on specific aspects of construction 
and an index to tables which appears on the inside of the back cover. 

Construction Review is for sale by the Superintendent of Documents, Government 
Printing Office, at $3.00 per year (12 issues) domestic or $4.00 foreign; 30 cents a sin- 
gle copy. D. D. F. 


Soviet Industrial Production 1928-1951. Donald R. Hodgman. Cambridge: Harvard Uni- 
versity Press, 1954. Pp. xix, 241. 


Joseph A. Kershaw, The RAND Corporation


In recent years a number of first-rate monographs on various aspects of the Soviet
economy have appeared. These are largely by young men and women who have
been trained at the Harvard or Columbia Russian research centers, both of which 
were established after the War. The monographs, more of which are imminent, repre- 
sent a gratifying return on that investment, and must be particularly rewarding to 
those who conceived the need for such research and study and engineered the initial 
establishment of the facilities. 

Hodgman’s study of industrial production is one of these. It is a solid piece of re- 
search on a difficult subject. It is careful and painstaking, with all its many limitations 
laid bare for the reader to ponder. The methods are clearly outlined and by reference 
to the inevitable appendixes one can reconstruct any table or check any figure. This 
is most important in research of this type. As Gerschenkron puts it in a Preface,
“More than in any other field of research, the rule of the scholar in Soviet economic 
research must be to give the reader as full an opportunity as is humanly possible to 
follow the writer’s use of the original data.” 

After a review of the official Soviet government indexes of industrial production in 
Chapter 1, and of the reasons why those indexes exaggerate the real rate of growth, 
Hodgman sets out to construct his own index. The thing that distinguishes his index 
from others is the weighting system he uses. He would like to use value-added weights 
and chooses, therefore, payroll data for 1934, payrolls being as close to value-added 
as the available data will permit him to come. These weights are applied to the phys- 
ical outputs of as many industries as he can find data for. It is interesting to follow 
Hodgman through his difficulties as he attempts to implement this apparently simple 
statistical decision. For example, he finds that he must derive his payroll data by mul- 
tiplying the average wage in each industry by its employment; but there turn out to 
be two sets of employment data available from the Central Statistical Administration
with no obvious logical ground for choice between them. Or the wage data are avail- 
able only for aggregated groups of industries. And so on through many problems. 
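
The weighting scheme described above can be sketched in a few lines. This is only an illustration of a fixed-weight index of the general kind the review describes, not Hodgman's actual computation: the function name, the rebasing step, and every figure below are assumptions made for the example.

```python
def payroll_weighted_index(outputs, payroll_weights, weight_year, base_year):
    """Fixed-weight output index (a sketch, not Hodgman's exact procedure).

    outputs[industry][year] : physical output of one industry (hypothetical data)
    payroll_weights[industry]: payroll of that industry in the weight year
    The index is set to 100 in base_year.
    """
    years = next(iter(outputs.values())).keys()

    def aggregate(year):
        # Each industry's output relative to the weight year, weighted by payroll.
        return sum(
            payroll_weights[ind] * series[year] / series[weight_year]
            for ind, series in outputs.items()
        )

    base_level = aggregate(base_year)
    return {year: 100.0 * aggregate(year) / base_level for year in years}


# Hypothetical figures, not taken from Hodgman's tables.
outputs = {
    "coal": {1928: 10.0, 1934: 20.0, 1937: 30.0},
    "cloth": {1928: 20.0, 1934: 25.0, 1937: 30.0},
}
payroll_weights = {"coal": 3.0, "cloth": 1.0}
index = payroll_weighted_index(outputs, payroll_weights, weight_year=1934, base_year=1928)
```

With such a scheme the choice between two employment series, or the use of wage data available only for groups of industries, changes the weights and therefore the index, which is why the difficulties the reviewer lists matter.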

The resulting index of industrial production is much more reliable and compre- 
hensive for the pre-war period than for more recent years. For the years 1928-1937 
the index includes 137 products, which Hodgman estimates covers about half of net 
value-added in industry in 1934. The most important omissions are lumbering, repair 
plants, a part of machinery, metal wares, needle trades and nonmetallic minerals. 
For the postwar period many of the sources dry up and Hodgman regards the exten- 
sion of the index to this period as “an especially dubious procedure.” He is able to 
include only 22 products between 1946 and 1950 and only 18 in 1951. His post-war
index contains no chemicals, very few machinery items and no military items; these
omissions, it may be noted, are commodities whose growth was most likely greatest. 
Hodgman’s caveats on the inadequacies of his index in this period are very much 
in order. 

Hodgman’s index confirms the notion that industrial production has grown at a 
remarkable rate in the USSR. In 1937 the index stands at 371, with 1928 equal to 100. 
One has a feeling that the increase in the 1930’s is pretty well described by Hodg- 
man’s index. He compares it to other indexes in a most effective chapter which con- 
tains a careful, calm, objective and, to my mind, devastating description and an- 
alysis of work done by Naum Jasny and Colin Clark in this same area. Incidentally, 
since Hodgman’s book was published, Gerschenkron has summarized his own work 
on Soviet industrial production.¹ Gerschenkron’s index covers only heavy industry
(coal, iron, steel, machinery, electric power and petroleum) and he weights his outputs
with 1939 U. S. dollars. The weighting system would make his index rise less rapidly 
than Hodgman’s, but the items he covers would make it rise more rapidly. His result 
for 1937 is 370, based on 1928/29 as 100, which under the circumstances at least 
tends to confirm the Hodgman result. 

As already indicated, the Hodgman index after 1937 and in particular since the
war is nowhere near so satisfactory. It is nonetheless interesting and shows a con- 
tinuation of a high rate of growth after the interruption of the war. By 1950 it stands 
at an amazing 646, fifty per cent higher than 1940. 
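
The index levels quoted in this review can be turned into average annual growth rates with a compound-growth formula. The short sketch below does that arithmetic; the function name is an assumption, and the rates are a convenience for the reader rather than figures reported by Hodgman or the reviewer.

```python
def average_annual_growth(start_level, end_level, years):
    """Compound average annual growth rate between two index levels."""
    return (end_level / start_level) ** (1.0 / years) - 1.0


# 1928 = 100 and 1937 = 371, as quoted in the review.
rate_1928_1937 = average_annual_growth(100.0, 371.0, 1937 - 1928)

# 1950 is stated to be fifty per cent above 1940, so only the ratio 1.5 is needed.
rate_1940_1950 = average_annual_growth(1.0, 1.5, 1950 - 1940)
```

On these figures the first rate comes out between 15 and 16 per cent a year, and the second at a little over 4 per cent a year, the latter spanning the war years.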

There is some question as to whether the Hodgman index includes small-scale
industry.² This is a problem of some significance, at least in the early 1930’s, since a part
of the growth of large-scale industry took place because of the absorption of
small-scale industry. Hodgman calls his index a “large-scale industry” index; it is clear that
his weights are payrolls applying to employment in large-scale industry only, but the
physical output series are apparently large-scale only in some cases and all industry in
others. A check with Socialist Construction 1936, for example, one of Hodgman’s many
sources, indicates that output data for the years 1932 to 1934 are for all industry in
every case except shoes and vegetable oils. Curiously, Hodgman is never explicit on
this matter.

1 “Soviet Heavy Industry: A Dollar Index of Output, 1927/28-1937,” The Review of Economics and Statistics,
Vol. 37, May 1955. Pp. 120.
2 Defined as establishments with 16 or less employees if power is used, and 30 or less when no power is used.

A final chapter discusses a few fairly interesting implications of the findings, al- 
though this discussion is not up to the rest of the book. There is a section on labor 
productivity (in the aggregate), one on changes in output of consumer goods per
capita over time, and one on international comparisons of industrial growth. These 
are rather pedestrian, or perhaps simply a let-down from the pioneering research re- 
ported in the earlier chapters. One specific point should not go unnoted: when com- 
paring industrial production indices of the USSR and a number of western economies, 
one would think long and hard to find a more misleading statistical device than the 
equation of 1932 to 100. Yet there it is, big as life, in Table 26 on page 128. The auth- 
or’s explanation of why he chose 1932 (on p. 127) indicates that his mind was on 
something else. 

In summary this is a first-rate piece of research. For students of the Soviet economy 
it will be a standard reference. For the statistical practitioner as well it has much of 
interest. 


Introduction to Demography. Mortimer Spiegelman. Chicago: The Society of Actuaries, 
1955. Pp. xxi, 309. 


Robert G. Potter, Jr., Office of Population Research


At the request of the Society of Actuaries, Mr. Spiegelman has written a general
introduction to the methods of demography. Although the author speaks of
designing it for the actuarial student, its content is general enough to interest a far
wider group. This is fortunate because the textbook is a superb one, being tightly or- 
ganized and meeting high standards throughout its length. 

Space is evenly divided between a number of topics: collection of census and vital 
statistics (22 pp.); errors in such data and their adjustment (24 pp.); morbidity (27 
pp.); family formation, composition, and dissolution (19 pp.); fertility and reproduc- 
tion (28 pp.); migration (25 pp.); working population (25 pp.); and population esti- 
mates and projections (25 pp.). The treatment of mortality is relatively more in- 
tensive, with 53 pages apportioned among chapters on mortality rates, construction 
of life tables, and mortality projections. 

In his introductory chapter the author emphasizes (p. 4) that a proper inter- 
pretation of demographic statistics requires (a) that one have a “clear and precise 
understanding” of the descriptive terms used, (b) that one ascertain the quality of 
the data, and (c) that one study critically the computation by which the original data 
are turned into summary measures. The chapters which follow seek to train the reader 
for such appraisals. Basic concepts are described in terms of their operational defini- 
tions. These definitions are briefly discussed in relation to practical application and 
problems of error. The main sources of data are cited. The author is especially suc- 
cessful with his compact surveys of methods for converting raw data into summary 
measures. The substantive aspect is not ignored but in any chapter is usually confined 
to two or three pages summarizing the established generalizations which pertain to 
the United States or Canada.

A few particulars are worth mentioning. Chapter 3 discusses the detection and
adjustment of error in census and vital statistics. The treatment of age misstatements is
especially complete. Chapter 7 is of special interest because it shows that morbidity 
is finally emerging as a field of analysis in which there is some consensus over defini- 
tions, a definite set of methods, and a respectable body of empirical data. Like Chap- 
ter 3, this chapter should be valuable to student and practitioner alike. The concept 
of a specific or corrected rate is thoroughly exemplified in Chapter 4 (“Measures of 
Mortality”). Chapter 5, dealing with life table construction, goes into more technical 
detail than other chapters. This emphasis is justified by the fact that in other chapters 
uses of the life tables are described with respect to morbidity, nuptiality, risk of or- 
phanhood, labor force participation, fertility, migration, and population estimation 
and projection. Chapter 9 (“Fertility and Reproduction”) includes a résumé of 
“intrinsic rates.” Chapter 8 (“Family Formation, Composition, and Dissolution”) 
and Chapter 11 (“Working Population”) focus on operational definitions and sub- 
stantive aspects, since the measures involved are nearly all adaptations of measures 
described earlier in the book. In both Chapter 10 (“Migration”) and Chapter 12 
(“Population Estimates and Projections”) an impressively wide variety of measures 
are compared.

There are additional reasons for believing that the text is an unusually good one. 
The style of writing is succinct and lucid. Only rarely does it become so terse or de- 
tailed as to give difficulty. A detailed table of contents is provided as well as name 
and subject indexes. Almost no footnotes appear. Instead a careful list of references, 
grouped by chapter sub-sections, is given in the back of the book. 

This book will have several functions. For the practicing demographer or actuary it 
will serve as a valuable reference volume on the basis of its careful name and subject 
matter indexes, its compact surveys of methods in the text, and the keying of this text 
to an up-to-date, selective bibliography. It will meet admirably its stated purpose of 
introducing actuarial students to the methods of demography. Regarding the non- 
actuarial student, its usefulness will be great in the rare population course which is 
two semesters long and devotes one of these semesters to methods. But for the
general, one-semester course its appropriateness as a main text is questionable. The focus
on methodology makes it a specialized book. Nor is all of it assignable unless a back- 
ground of elementary calculus is assumed. Mathematical derivations are avoided but 
where possible demographic measures are expressed in mathematical notation. In 8 
out of 12 chapters these expressions involve nothing more advanced than simple or 
weighted averages of rates; but in sections of four chapters use is made of notions of 
integration, differentiation, or finite differencing. Furthermore, supplementary read- 
ings or instruction will be needed. The book provides no practice problems. Also the 
discussions of life table construction and intrinsic rates move rather quickly into 
advanced material. The reviewer doubts that a beginner can handle these sections 
unassisted even if he has the necessary mathematical background. 

In summary, this is a splendid, specialized textbook on population methods which 
will interest the practitioner and the advanced student, but not the beginner. 


An Estimate of Metropolitan Chicago’s Future Population: 1955-1965. Donald J. Bogue, 
Oxford, Ohio: Scripps Foundation for Research in Population Problems, 1955. Pp. 8. 
Paper. 


A. H. LeNeveu, Dominion Bureau of Statistics


This report gives the estimated population of the city of Chicago and of the
suburban ring of the metropolitan area by age and color for 1955, 1960, and 1965, as
well as the estimated number of dwelling units needed to house the “medium”
estimated population for these years. The estimates for 1960 and 1965 cover a range of
probable future growth in population, the “high” estimate reaching 6,956,000 for the 
standard metropolitan area in 1965, the “low” 6,526,000. All three estimates for the 
city of Chicago itself in 1965 are just under 4,000,000, though an error in Table 1 
shows the “low” estimate 20,000 higher than the “high” one. The experience of the 
1940’s was applied to the future both for births and migration, the “high” estimates 
being based on conditions of “the immediate post-war period” and the “low”
estimates on the conditions of “the late 1940’s.” However, the crude birth rates in Table 8, used
in calculating the “low” estimates, appear too low to characterize the period of the 
late 1940’s. The estimates for each future five-year date were prepared by applying 
assumed birth and migration rates to the base population. The base population was 
then carried forward to the next five-year date by use of survival ratios computed from 
the 1949-51 life table. 
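
The projection step described above, carrying each age group forward with survival ratios and then adding births and migrants, can be sketched as follows. This is a generic cohort-survival step, not Bogue's worksheet: the function and argument names are hypothetical, births and net migrants are passed in as ready-made numbers rather than derived from rates, and the figures are invented.

```python
def project_five_years(population, survival_ratios, births, net_migrants):
    """Advance a population in five-year age groups by one five-year step.

    population[i]      : persons in age group i (0-4, 5-9, ...); last group is open-ended
    survival_ratios[i] : share of group i still alive five years later
    births             : surviving births of the interval (the new 0-4 group)
    net_migrants[i]    : net migrants credited to group i at the end of the interval
    """
    n = len(population)
    projected = [0.0] * n
    projected[0] = births
    # Survivors of each group move up into the next age group.
    for i in range(n - 1):
        projected[i + 1] += population[i] * survival_ratios[i]
    # Survivors of the open-ended last group stay in it.
    projected[n - 1] += population[n - 1] * survival_ratios[n - 1]
    return [p + m for p, m in zip(projected, net_migrants)]


# Invented three-group example, for illustration only.
result = project_five_years(
    population=[100.0, 80.0, 60.0],
    survival_ratios=[0.98, 0.95, 0.50],
    births=110.0,
    net_migrants=[5.0, 0.0, -2.0],
)
```

Repeating the step, with the output of one interval as the base of the next, gives the 1960 and 1965 figures from a 1955 base; "high" and "low" series differ only in the birth and migration inputs.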

The large increase in births during the last half of the 1940-1950 period will mean 
a substantial addition to the number of persons seeking employment, entering college, 
and establishing new homes between 1960 and 1965. Around 1965 the age group 25 to 
34 (born largely during the depression) will show a “shortage.” The reader might 
wonder in what degree the deficiency in this age group will be common throughout the 
United States by 1965. The future growth in the nonwhite population will depend 
upon the expected reduction in fertility resulting from continued living in an urban
environment. Any major change in economic conditions would likely have immediate
effect on all migration to the Chicago area, and such a contingency is the main hazard
of this as of any other projection of city populations. 


A Projection of the Population of Colorado. Morris E. Garnsey and R. R. Pelz. University of
Colorado Studies, Series in Economics, No. 2. Boulder, Colorado: University of Colorado 
Press, 1955. Pp. vii, 69. $1.50. 


Dennis H. Wrong, University of Toronto


This well-organized study presents a projection of the population of the state of
Colorado and of special groups within the state from 1950 to 1980. The total
population enumerated at the 1950 census was 1,325,089; a 1980 population of 
2,202,215 is projected, an increase of 64 per cent over the thirty-year period. The 
laborious cohort-survival method of projection is employed. The authors selected it 
rather than a simpler method because they wished to obtain figures for the total
population at five-year intervals as well as age-sex breakdowns of the projected totals. 
The many computations required presumably deterred them from making more than 
a single series of projections, although they do present alternative figures based on two
differing estimates of net migration. While they clearly state their assumption of con- 
tinuing economic expansion in both the state and the nation, the hazards of relying 
on a single estimate of future fertility for a period as long as thirty years are perhaps
insufficiently recognized. 

Historical and recent projected series for the entire United States are used as a 
guide for projecting vital rates in Colorado. The number of births per year is expected 
to increase as a result of an increase in the female population, but fertility rates are 
expected to decline steadily from their present high levels although remaining well 
above the low 1930-40 levels. The authors make their own projections of five-year 
mortality rates for the United States, anticipating a continuation of the rates of de- 
crease which have occurred since 1940. They assume that Colorado rates, slightly 
higher in the past for cohorts under 50 and lower for older cohorts, will become
identical with the rates for the whole country by 1975. About 14 per cent of the total
projected growth is expected to come from net migration into the state. Plausible 
assumptions are made about the age-sex distribution of migrants, enabling them to be 
added to the appropriate age cohorts at the end of each five-year interval. Two
migration estimates are presented, the second allowing for substantially higher figures
resulting from the establishment of the U. S. Air Force Academy near Colorado
Springs and the possible development of an oil-shale industry in western Colorado. 

The most novel and interesting feature of the monograph is an analysis of statistics
of past migration to and from Colorado which relies chiefly on the state-of-birth re- 
ports of the Bureau of the Census. The percentage of Colorado’s population born 
outside of the state has exceeded that born in the state at every census since 1870, 
but has fallen consistently from 84 per cent in 1870 to about 53 per cent in 1940- 
1950, during which decade the figure remained almost stable. Movements into Colo- 
rado have greatly exceeded movements out of the state. The relationship between net 
migration by decade since 1870 and economic developments in Colorado is briefly 
indicated. In older states the ultimate attainment of a regular migration “deficit”
has been a sign of achieved economic maturity. Professors Garnsey and Pelz point 
out that since the war Colorado “has been one of the technological frontiers of the 
nation” and thus can expect for at least the next thirty years to draw more people to 
it than will be encouraged to depart in search of better opportunities elsewhere. In- 
migrants have come largely from the East while most of those leaving the state have 
gone farther west; Colorado’s migration pattern thus conforms to the general west- 
ward movement of population. This section of the monograph, particularly Table 7, 
provides a useful model for the statistical analysis of interstate migration data, not 
least of all because it highlights the paucity of data on the age-sex characteristics of 
migrants. 

Part II of the monograph presents projections of the labor force, college enrollment,
and first-grade school enrollment. A supplement by Judson B. Pearson presents en- 
rollment projections for the University of Colorado. Official projections of labor-force 
and college “participation rates” for the whole United States were used as a guide to 
estimate the corresponding rates for Colorado. Medium and high estimates for both
are given. Problems of definition and the many societal variables affecting education 
are fully discussed in connection with the college enrollment projections. The pro- 
jection of first-grade school enrollment is admittedly crude, being based solely on the 
estimated birth totals moved ahead six years. No allowances are made for mortality 
of children at ages 0-6, migration, failures to enrol, or for age differences at enroll- 
ment. The authors argue, however, that corrections for these factors tend to cancel 
one another out. They take advantage of the re-introduction of the projected birth 
figures to state more fully their assumptions about future levels of fertility, conceding, 
in what strikes this reviewer as an understatement, that “it is very possible . . . that
the assumptions about the average fertility rates of Colorado females will be in error.” 


Mortality Trends in the State of Washington. Calvin F. Schmid, Earle H. MacCannell, and 
Maurice D. Van Arsdol, Jr. Seattle, Washington: Washington State Census Board, 1955.
Pp. iii, 73. Paper. 


E. J. Brower, Dominion Bureau of Statistics 


This study, second in a series published by the Washington State Census Board
and similar to the senior author’s earlier study of Mortality Trends in the State of 
Minnesota, presents for Washington an integrated compilation of information of 
practical usefulness in understanding population trends and in planning public health
programs. Data from the United States Bureau of the Census, the National Office 
of Vital Statistics, and the Washington State Department of Public Health are unified 
in a logical and clear-cut presentation. Against the positive background of a life table 
the past and present trends of total mortality and the major causes of death are related 
to certain basic population characteristics. The study covers the forty-year period
from 1910, when Washington was admitted to the United States death registration 
area, to 1950. The 70-page booklet contains not only a concise summary of general 
mortality trends, and a comparative treatment of the ten leading causes of death, but 
in addition more than thirty-five causes are separately described in a series of some 
twenty-five tables. Other special charts and diagrams portray general morbidity 
trends, mortality trends by age, sex, rural and metropolitan distribution, and average 
future lifetime and life expectation for major components of the population based on 
current life tables. 

The forty-year period witnessed a total of 698,76i/ deaths, exclusive of stillbirths, 
or an annual mean of 17,469. Washington has always had a favorable mortality record 
in comparison with other states and the general trend for the whole period has of 
course been downward, from an age-adjusted rate of 17.5 in 1910 to 8.9 in 1950, with 
a high point of 15.3 during the influenza pandemic of 1918. Although fluctuations for
males and females show a marked relationship, male rates are higher than female rates 
for the entire period. For the younger age groups trends have been characteristically
downward whereas mortality among older persons has shown little decline and
sometimes an actual increase. In the triennial period 1939-41 the expectation of life at
birth for the total population of the state of Washington was 66.1 years; for 1949-51
the corresponding figure was 69.2 years. The most significant feature of mortality in
the state of Washington, as indeed almost everywhere else, has been the rapid decline
of mortality rates for communicable, infective, and parasitic diseases, following
general improvement in living standards, expanded preventive and case-finding
activities of government health agencies, and the development of new therapeutic measures.
A consequence of diminished mortality from these diseases in youth has been the 
growth in relative importance of the chronic and degenerative causes in later life, 
notably of heart disease. 

The authors draw interesting comparisons between the leading causes of death in 
1910 and in 1950. In 1910, the ten major causes of death had a combined crude death 
rate of 664.0 and comprised 66.2 per cent of the total death rate of 1,003.1 per 100,000 
population. By 1950, the combined crude death rate of the ten major causes had in- 
creased to 794.7 and constituted 84.3 per cent of the total death rate of 942.8 per 
100,000 population. In fact, the three highest ranking causes of death in 1950, diseases 
of the heart, malignant neoplasms, and vascular lesions of the central nervous system, 
accounted for 62.5 per cent of all mortality in the state. The importance of this in- 
crease in the death rate from the degenerative diseases is further shown in the state- 
ment that the three causes of death ranking highest in 1950 accounted for only 2.5 
per cent of all deaths in 1910. 
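
The percentage shares quoted above follow directly from the crude rates. A minimal sketch of that arithmetic, using only the four rates given in the review (the function and variable names are illustrative, not the authors'):

```python
def share_percent(part: float, total: float) -> float:
    """Return `part` as a percentage of `total`."""
    return 100.0 * part / total


# Crude death rates per 100,000 population, as quoted in the review.
rates = {
    1910: {"ten_leading": 664.0, "all_causes": 1003.1},
    1950: {"ten_leading": 794.7, "all_causes": 942.8},
}

# Share of the total death rate attributable to the ten leading causes.
shares = {
    year: share_percent(r["ten_leading"], r["all_causes"])
    for year, r in rates.items()
}

for year, share in shares.items():
    print(f"{year}: ten leading causes = {share:.1f} per cent of the total rate")
```

Rounded to one decimal place, the two shares agree with the 66.2 and 84.3 per cent reported in the text.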


Psychometric Methods. Second Edition. J. P. Guilford. New York: McGraw-Hill, 1954. 
Pp. ix, 587. $8.50. 


M. Clemens Johnson, Educational Testing Service 


A second edition of Professor Guilford’s book surveying the areas of psychological 
measurement appears after an interval of eighteen years. References at the end of 
each chapter carry the reader through past and present developments, as of 1952. 
Changes with respect to the first edition include new chapters on a general theory of 
measurement, psychophysical theory, a mathematical introduction, objective findings 
with respect to human judgment, and current psychological test theory. A general 
purpose is to emphasize the unity among areas of psychological measurement that 
had grown up somewhat independently. 

Material on psychological scaling is expanded and includes a summary of multi- 
dimensional scaling and its application. A number of problems with answers appear 
at the end of each chapter and an excellent set of tables is provided. To make room 
for new topics and examples, separate chapters on curve fitting, simple correlation, 
and multiple and partial correlation have been eliminated. The over-all result is an 
expanded presentation with the changes in emphasis which have been noted. 

The book is written for the nonmathematical student who possesses a good intro- 
duction to fundamental arithmetic and algebraic processes. Some familiarity with 
statistical symbols and elementary computations is also desirable. Many of the nec- 
essary arithmetical and statistical ideas are presented in a special chapter designed 
for this purpose. 

Nine chapters in a total of sixteen are devoted to techniques associated with meas- 
urement in traditional psychophysical research and with the many psychological- 
scaling methods. Somewhat greater emphasis is placed upon the latter procedures in 
which the end results are values on psychological rather than physical scales. Three 
chapters review problems of mental testing with a concentration upon basic quanti- 
tative ideas and more common techniques. Since mental testing tends to dominate 
the field of psychological measurement, the space allotment might have been mis- 
leading. This point, however, does receive special attention by the author. A final 
chapter describes and illustrates procedures employed in factor analysis. 

In view of the purpose and content of the book, problems of statistical inference 
are not surveyed to any great extent. A number of specific experiments are employed 
to illustrate psychometric methods. No clear description of principles of modern ex- 
perimental design is available. Significance tests of various types are rather consist- 
ently applied throughout the book and include a number of applications of chi-square 
and analysis of variance techniques. One interesting example involves the use of the 
analysis of variance in a problem employing ratings of seven individuals in five traits 
as given by three raters. Since the underlying assumptions are not treated in most 
applications of the tests nor in the review of statistical concepts, the reader may wish 
to consult the supplementary references for more rigorous treatments. 

In summary, the reviewer believes that this revision provides researchers and 
advanced students with a well-documented summary of developments in the various 
areas of psychological measurement. The effort to organize and provide logical con- 
tinuity to these areas is commendable. 


Mathematics and the Social Sciences. International Social Science Bulletin, Vol. VI, 
No. 4. Paris: UNESCO, 1954. Pp. 581-685. $1.00. Paper. 


Robert R. Bush, Harvard University 


This issue of a quarterly bulletin published by Unesco is devoted mainly to eight 
papers on how mathematics has been applied to social science problems. Each 
paper is a survey or review rather than a research report. 

In an introductory paper, the eminent social anthropologist, C. Levi-Strauss, out- 
lines the history of mathematical thinking in the social sciences. He begins by de- 
scribing the great progress made in structural linguistics during the past 30 years. 
Although this field has become much more quantitative and precise in the last 10 
years or so, little mathematics has actually been used. Communication engineering 
has influenced many researchers in linguistics, but one could hardly say that mathe- 
matics has become a major tool of those investigators. Mr. Levi-Strauss is very opti- 
mistic about the future, however. 

Much of the progress being made in applying mathematics to social science, ac- 
cording to Levi-Strauss, has resulted from a shift of emphasis from “quantitative” 
mathematics (e.g., classical analysis) to “qualitative” mathematics (e.g., topology). 
Examples of the new approach are A. Weil’s treatment of the rules of marriage and 
descent, and the theory of games. Concerning the latter, Levi-Strauss notes that the 
mathematics has become more sophisticated and the problems handled more con- 
crete. In conclusion, the author makes a strong plea for more mathematical training 
of young social scientists. But he is disturbed because present teachers and adminis- 
trators are “intellectually ill-equipped to plan and carry out” the needed revisions of 
curricula. 

In a paper called “Probability and the Social Sciences,” B. de Finetti comments on 
the history of probability theory and how probability notions are naturally involved 
in information theory, game theory, and decision theory. In an appendix, he gives 
several “definitions” of probability without mentioning the modern axiomatic treat- 
ment of the subject. The paper offers nothing new or interesting to an expert and 
nothing very understandable to a non-expert. 

Colin Cherry, an applied mathematician and communications engineer, writes 
“On Mathematics of Social Communications.” He discusses the use of analogies in 
science and with caution describes how telecommunication systems can be used as 
models for human interaction problems. 

An outstanding social psychologist, Leon Festinger, writes on mathematics and 
sociology. He relates how game theory has suggested several experiments in sociology, 
but notes that the mathematical theory itself is inapplicable to most sociological 
problems; it is a normative theory rather than a behavioral one. In the same way, 
graph theory has led to valuable experimental work on group structure, according to 
Festinger. Two attempts to develop mathematical theory of actual social behavior 
are then described and criticized. The first, by Hays and Bush, has been applied only 
to a highly restricted class of phenomena; as a result, Festinger is skeptical about this 
approach. The second attempt, by Simon, applies to a much broader class of social 
problems but has led to nothing new, says Festinger. Nevertheless, he considers this 
work of Simon to be “to date, the most promising use of mathematics in social psy- 
chology.” 

A. Tustin and R. C. Booton, Jr., present a program for economic prediction and 
stabilization. In general terms they discuss the problems of specifying the functional 
relations among economic variables and in estimating parameters. Most work has 
been done, they say, on linear systems. The use of analogue computers is discussed. 

To readers who, like this reviewer, know very little mathematical economics, G. 
Tintner’s paper will be enlightening. He characterizes and illustrates four classes of 
econometric models. For each type of model he lists the mathematical methods that 
have been used. In discussing problems in economic statistics, Tintner points out that 
classical statistical models are seldom appropriate for describing the sequential data 
of economics. More advanced methods in time-series analysis are required here. 
(Similar comments would apply to most psychological data.) 

An elementary but extremely lucid discussion of sampling theory is given by P. 
Thionet. Without actually using any mathematics, this writer describes and criticizes 
the various mathematical methods in sampling. Finally, he describes the major pro- 
cedures used by public opinion polls and discusses the influence (or lack of it) of 
mathematical methods on these polls. 
Elbridge Sibley, executive secretary of the Social Science Research Council, gives 
a detailed historical account of how his organization has promoted the mathematical 
training of social scientists. This work, which began in 1932, has become a large-scale 
activity of the Council. After Sibley’s history was written, two more summer insti- 
tutes were held in 1955 and two more in 1957 are now being planned. Since about 
1951, most of the Council’s mathematical training activity has been directed by 
William G. Madow. Work of four other organizations in this area is also described. 

The use of mathematics in psychology is not discussed in any of the several papers. 
This omission was apparently intentional, but it may give some readers a false im- 
pression. Considerable mathematics has been used in psychological testing, psycho- 
physics, and in the psychology of learning. Altogether, psychology has been as big a 
consumer of mathematical methods as economics has been. By comparison, sociology, 
anthropology, and political science have been only slightly influenced by mathemat- 
ics other than descriptive statistics. 


Small Groups: Studies in Social Interaction. Paul Hare, Edgar F. Borgatta and Robert F. 
Bales, Editors. New York: Alfred A. Knopf, 1955. Pp. xvi, 666. $6.50. 


Leo Katz, Michigan State University 


This is a collection of reprinted materials, mostly from recent journals, focusing on 
social interaction in small groups. An extensive annotated bibliography lists 584 
publications, including 55 which are reprinted in greater detail in the text. The anno- 
tations by the editors are informative though in capsule form, ranging in length from 
a simple sentence to a substantial paragraph. The editors partially overcome the bias 
toward the selection of briefer papers by the devices of abridging longer ones and of 
representing important books in this field by a series of excerpts from each. 

Part I, on historical and theoretical background, contains four selections on the 
early theory, five on early research and eleven on current theory. Part II is devoted 
to recent research with emphasis on the individual in social situations. Three papers 
in this section are concerned with the influence of interaction with others on individual 
mental activities such as learning, problem-solving, and evaluating; four bear on the 
individual’s perception of thoughts and feelings of the others, and the relation of this 
to his position in the group; eight deal with constancy and change in individual overt 
behavior in relation to the group. 

Part III, treating the group as a system of social interaction, has five papers on the 
communication network, three on equilibrium problems in the network, five on 
specialization and role differentiation, and seven papers on the perennial problem of 
leadership. Part IV is the bibliography mentioned earlier. 

The editors, in a short preface and four very brief introductions to the separate 
parts, make some attempt to indicate relationships among the selections included. 
In many cases, they resort to presenting papers in order of publication. 


A Social Profile of Detroit, 1954: A Report of the Detroit Area Study of the University of 
Michigan. Ann Arbor: The University of Michigan, 1954. Pp. vi, 29. $1.00. 


David Solomon, McGill University 


This booklet presents facts obtained in the third annual survey conducted by the 
Detroit Area Study of the University of Michigan. The primary purpose of the 
Study is “to collect and interpret basic information about the social and economic 
characteristics of the Detroit Area population” (p. 1). Most of the data were obtained 
through interviews early in 1954 of “a representative cross-section” consisting of 764 
private households drawn from 337 blocks and comprising 1,668 adults. If necessary 
as many as ten call-backs were made to reach “the individual to be interviewed” in 
each home. (We are not told how this individual was selected.) Interviews were com- 
pleted in 86 per cent of the households listed. The various chapters report on family 
income, television ownership, viewing of an educational television program, the 
residential distribution of migrants, and some attitudes toward employees and em- 
ployment in public compared with private organizations. 

Median family income in 1953 was $5,700, an increase of 24 per cent over 1951. 
While the incomes of families whose heads were “managers, officials, and proprietors,” 
increased by 50 per cent, those whose heads were professionals experienced almost no 
increase at all. 

Eighty-seven per cent of families interviewed owned television sets. The principal 
variations are associated with income and occupation. Ownership rates for the lowest 
and highest income groups were 61 and 92 percent respectively. Comparing occupa- 
tional groups, “professional, technical and kindred workers” have the lowest propor- 
tion of owners, about 75 percent, while “managers, officials and proprietors” have the 
highest, 98 percent. 

A brief chapter shows the proportions of respondents who had seen the University 
of Michigan Television Hour. Thus “43 per cent of all adult Detroit Area residents 
had seen the educational program at sometime,” “27 per cent of the cross-section 
sample interviewed . . . reported that they had seen the most recent series of courses,” 
and of these “24 per cent . . . reported seeing the show once a month or more often.” 

Data on the residential distribution of migrants to the Detroit Area were obtained 
from interviews in 1,157 homes in early 1953. Recent migrants tend to live close to the 
center of the city, while native Detroiters and migrants from other parts of Michigan 
tend to be farther than 8 miles out. 

Readers who happen to be in the public service may be somewhat cheered to know 
that, at least in the Detroit Area, except for the highest-income group, the replies of 
over half the respondents indicate attitudes favorable to government employment 
and employees. 

The authors have been to some pains to keep the presentation simple and unen- 
cumbered by detailed explanations or footnotes. This is not altogether a disadvantage, 
but the lack of “technical details of sample selection” promised (p. 4) but not given in 
the appendix, leaves some questions unanswered. First, one wonders why the table 
totals are so inconsistent, particularly in the chapter on migrants. Second, and of 
greater interest, one wonders why the replies of apparently preselected individuals, 
one to each home or household in the sample, can be taken to represent “all adults in 
the Detroit Area,” as apparently they are in both the chapter on the University of 
Michigan Television Hour and that on migrants. 
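
The second question can be made concrete with a small illustration. The booklet's actual selection and weighting procedure is not described, so the sketch below rests on an assumption: that one adult is chosen at random from each sampled household. Under that assumption an adult in a household of n adults has only a 1/n chance of selection, and an estimate for "all adults" would need each reply weighted by n. The household data are entirely hypothetical.

```python
# Hypothetical sample: one interviewed adult per household.
# Each tuple is (number of adults in the household,
#                whether the interviewed adult had seen the program).
households = [(1, True), (2, False), (3, False), (2, True), (4, False)]


def unweighted_share(sample):
    """Share of interviewed adults who saw the program, one vote per household."""
    return sum(1 for _, saw in sample if saw) / len(sample)


def weighted_share(sample):
    """Share of all adults who saw the program, weighting each reply by
    the number of adults in the household (the inverse of the assumed
    within-household selection probability 1/n)."""
    total_weight = sum(n for n, _ in sample)
    return sum(n for n, saw in sample if saw) / total_weight


print(f"unweighted: {unweighted_share(households):.2f}")
print(f"weighted:   {weighted_share(households):.2f}")
```

In this made-up sample the two estimates differ because the adults who saw the program live in smaller households; whether anything similar holds in the Detroit data cannot be judged from the review.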


Urban Traffic, A Function of Land Use. R. B. Mitchell and C. Rapkin. New York: Colum- 
bia University Press, 1954. Pp. xviii, 226. $5.00 


David A. Wallace, Redevelopment Authority of the City of Philadelphia 


City planning, in its present level of development, has pretensions both as an art 
and as a science. One of the major goals of city planning as a science is to be able 
to predict the traffic implications of various land use distributions and ultimately to 
control the interaction between land use and traffic. Urban Traffic is a major effort to 
provide a more scientific base for city planning policy and decision making. By R. B. 
Mitchell, Executive Director of the Philadelphia Urban Traffic and Transportation 
Board and Professor of City Planning at the University of Pennsylvania, and Chester 
Rapkin, formerly of Columbia University’s Institute of Urban Land Use and pres- 
ently Research Associate Professor at the University of Pennsylvania’s Institute for 
Urban Studies, it was written with the objective of creating a methodological frame- 
work for understanding the interaction of urban land use and traffic movement. 

For over two-thirds of the book, the authors examine concepts, technics for traffic 
analysis, and existing traffic data. They start with the concept of establishments as 
the simplest unit for examining the movement of persons, goods and materials, and 
move on to groups of units which they call systems of activities. This is a notion 
borrowed from the sociological theories of Parsons and Merton. Separate individuals, 
and groups of individuals organized as households and firms, use establishments as the 
base for their activities; these groups are parts of larger systems of activities. Wheth- 
er movements are studied in the mass or as individual actions, they must be con- 
sidered in the light of the larger systems, and the social roles played by the individuals 
and groups of individuals. Psychological research is emphasized as necessary to under- 
standing travel motivations, the factors influencing the selection of routes and desti- 
nations, and the roles in which people travel. 

The analysis of interacting systems of person-movements and goods-movements is 
further suggested as a way of understanding factors influencing change between land 
use and traffic. 

While this rather rigorously developed group of definitions and classifications is not 
a theoretical framework within which further empirical research can be carried out, 
it does suggest the direction toward development of such a theoretical framework. It 
is reminiscent both of the early formulations of R. M. Haig, particularly those con- 
cerning the struggle for location among the competing and conflicting land uses in 
attempting to minimize the frictions of space, as well as later theoretical locational 
analyses. The authors stress the way movement and communication requirements ac- 
cent accessibility to central locations and specific facilities. The need for accessibility 
reflects the functional requirement of the activity and the activity’s relation to the 
spatial distribution of other activities. Shifts in land uses become the reflection of 
changing functional requirements, growth, and the relation to other activities. 

The design and distribution of land uses together with their interconnecting ele- 
ments of highways, railroads, etc., particularly in urbanized areas, must eventually 
lean on the kind of theoretical framework suggested in this book. The relatively in- 
tuitive and segmented approach now used by planners and traffic engineers in analyz- 
ing various activities on the land vis a vis traffic will be slowly supplanted by the kind 
of analysis suggested here. 

Urban Traffic is an important contribution toward closing the gap that exists between present 
theory and practice. Empirical research carried out within such a framework as this 
book suggests will add cumulatively to our body of knowledge and theory, rather 
than provide only short range answers to limited analyses. 


The Worker Speaks His Mind on Company and Union. Theodore V. Purcell. Cambridge: 
Harvard University Press, 1953. Pp. xix, 344. $6.00. 


Robert L. Kahn, University of Michigan 


Purcell has written an important and unusual book. Not the least of its interest 
lies in the area of methodology. Methodological altercations between proponents 
of qualitative and quantitative techniques are common enough; insightful and care- 
ful combinations of statistical data and case-study intensiveness are far more rare. 
This combination is well exemplified in The Worker Speaks His Mind on Company 
and Union. 

Without belittling Purcell’s attention to tables of random numbers and analysis of 
variance, however, I believe that the major value of this book stems from the fact 
that it presents in detail the observations, insights, and many of the verbatim re- 
sponses obtained by a sensitive and perceptive social scientist who lived for 18 
months in a plant community. During that year and a half in an area of concentrated 
population adjoining a major company, depending on it and depended on by it, 
Purcell acquired that acceptance and intimate knowledge for which no short-term 
technique can provide a substitute. During that time also, he himself interviewed 
almost 400 employees of the meat-packing plant (Swift and Company) around which 
the community was built. The results of his work are best reflected in the interview 
protocols from which he quotes at length throughout the book. Here are the aspira- 
tions and accomplishments, the satisfactions and deprivations of a population of in- 
dustrial workers earning an average annual wage of $3,154 (for men who worked 52 
weeks in 1949). Through Purcell’s interviews, the men and women of Bronzeville 
and Back-of-the-Yards speak, and it is a rare reader who will listen to them without 
developing a deeper understanding of industrial problems and what can be properly 
called industrial psychology. 

The organizing idea around which the book is written is dual allegiance, the no- 
tion that the attitudes and loyalties of the American worker toward union and man- 
agement are complementary rather than in conflict, and that the worker does not 
experience this duality as tension-provoking. Purcell’s major predictions were (1) 
that dual allegiance would exist, or more specifically, that workers would express 
favorable attitudes toward both the company and the union; (2) that workers would 
perceive both company and union as essential to the satisfaction of their needs, and 
would see either one as unable to provide those satisfactions by itself; (3) that work- 
ers would feel no pull to resolve this duality in favor of company or union. Purcell 
feels that his data provide affirmation of all three hypotheses, although not without 
some qualification. 

On the whole, I do not quarrel with this interpretation of the data. The fact that 
73 per cent of all workers, and a substantial majority of each sub-group of them, ex- 
press favorable attitudes toward both company and union is evidence enough of the 
first point, and an important social datum in itself. 

It is plausible also that workers should see both company and union as required 
for the satisfaction of their needs, and it is almost certain that no worker could con- 
ceive of the union performing such functions in the absence of the company. A local 
union, after all, comes into organizational existence only when a company (or some 
other formal organization for accomplishing work) already exists. The workers might, 
of course, have viewed the company as an adequate need-satisfier without the modi- 
fying influence of the union. That they did not is perhaps the more important finding. 

Finally, there is the hypothesis about the stability of dual allegiance. The data 
here are least satisfying, an unavoidable consequence of a “one-time” study, how- 
ever intensive. Purcell finds only a small minority of workers who feel tense about 
their dual loyalties and experience a need to resolve them in favor of company or 
union. This is a believable situation during a period of labor-management harmony, 
but what of a time of dispute or strike? It seems to me that dual allegiance is feasible 
at some times, but not at others. It may be a kind of equilibrium or steady state 
which is subject to periodic disruption over issues which make the demands of the 
two institutions no longer compatible. When a strike-bound management urges its 
workers to return to their jobs and a striking union exhorts them not to cross a 
picket-line, what becomes of dual allegiance? A possible answer to this question is 
that even in such extreme situations dual allegiance persists, in the sense that the 
majority of workers still favor the continued existence of both company and union. 
This is a more restrictive definition than Purcell intends, however. Data collected 
during a time of stress, or better still data for the same population collected at dif- 
ferent times and in different circumstances with respect to union-management rela- 
tions would illuminate further this area. 

The existence of favorable attitudes toward both company and union confirms 
Purcell’s major hypothesis for this population of packing-house workers, but there 
are a number of other well-documented findings which are of comparable substantive 
interest. For example, there is the finding that the majority of foremen are favorable 
toward the union, a fact which contrasts sharply with the usual stereotype of the in- 
dustrial foreman. The aspirations of the workers also come through very clearly. They 
include a “boss who does not supervise too much, but lets me alone,” and a job with 
some variety. (Specialization and division of labor in the packing industry have been 
carried to the point where many of the positions appear to offer only unrelieved 
monotony, although some workers say that they find variety and challenge even in 
the most fractionated of the jobs.) 

Security and mobility are two other specific needs which the Swift workers men- 
tion often. The first of these appears to be well fulfilled in this company, with the 
general conditions of full employment and the specific provision for plant-wide sen- 
iority frequently cited as an additional reassurance. Mobility is a need less fulfilled 
for these workers, and their frustration in this area reveals itself in two main ways. 
Some workers express their resentment openly, criticizing management for the lack 
of offered opportunities. This is especially true of the Negroes, who complain explic- 
itly of their exclusion from supervisory jobs. Other workers seem to have renounced 
their own aspirations, or rather to have displaced them upon their children. Thus, 
many who state that they have “done all right” and like the company, add proudly 
“but my boy will not come to the yards.” 

Perhaps we may mention one additional finding. Purcell’s study provides evidence 
against the too-pat equating of foreman with company, or steward with union. The 
Swift workers appear to have little difficulty in discriminating between individuals 
and organizations. They can be pro-company while complaining bitterly about their 
foreman, or pro-union while expressing dissatisfaction with an ineffective steward. 

In summary, this is a successful and an important study, and the correctness of 
the sampling procedures, the intensive interviewing, and the careful statistical analy- 
sis encourage confidence in the findings as representative of the Swift plant commun- 
ity of a few years past. The conclusions are well worth testing in other industries and 
other union-management climates. It is encouraging to know that Purcell has already 
begun such research in at least two other mid-western cities. 
AMERICAN STATISTICAL ASSOCIATION 
REPORT OF THE BOARD OF DIRECTORS, 1955 


In July of this year, the Association suffered the loss of its Executive Director, Samuel 
Weiss, who died on July 23. Mr. Weiss was elected Secretary-Treasurer of ASA in April, 
1949. When he took office, the Association’s cumulative deficit was over seven thousand 
dollars and the members numbered about 4,200. At the time of his death the Association’s 
surplus had reached thirty thousand dollars and the membership had reached 5,400. 
During his six years of service, ASA underwent a healthy expansion of its activities in a 
number of areas. The Board and Council have approved a resolution to be published in 
the Journal and The American Statistician. 

Donald C. Riley, Deputy Director of the Office of Statistical Standards in the Bureau 
of the Budget, has agreed to serve as Secretary-Treasurer, succeeding Mr. Weiss. Recom- 
mended by a special committee that was appointed by President Watkins, and consisting 
of Samuel S. Wilks, Raymond T. Bowman and Martin R. Gainsbrugh, Mr. Riley was 
nominated by the Board, and elected by the Council, unanimously. 

It is the Board’s unpleasant duty to report that the Association will probably have a 
deficit of about 5,000 dollars for the year 1955. This is due almost entirely to the rise in 
the cost of printing, the increase in the size of the Journal and number of copies printed, 
and a loss on the publication of the “Proceedings of the Business and Economic Statistics 
Section.” More details on finances and membership figures are reported by the Secretary- 
Treasurer. 


The Journal, and The American Statistician 


The Journal for 1955 increased about 300 pages over the 1954 volume and has more 
than doubled in size since 1949. Under the editorship of W. Allen Wallis, a growing number 
of articles of high caliber in the various fields of statistical application and methodology 
are being submitted to the Journal. The publication of more and varied articles has raised 
even further the interest of statisticians everywhere in the Journal. This interest is already 
reflected to a certain degree in the number of new members. However, the increase in size 
and number of copies, coupled with a sharp rise in printing costs, has caused the expense 
of the Journal to exceed the amount budgeted. The problems arising from growth in both 
size and cost per page are being studied by the Board and will be reported on in the future. 

The American Statistician has also increased in the number of pages in the 1955 volume. 
Again, increased material acceptable for publication has resulted in larger individual 
issues. The expansion of the special departments, in addition to the articles, has made 
The American Statistician of greater interest to the membership and subscribers. 


Other Publications 


At the end of 1954 “Statistical Problems of the Kinsey Report” was published. After 
a full year, about one-third the number printed have been sold. As an inventory item, 
the sales will provide a small, but steady income to enable the Association to underwrite 
other monographs. The Report has been acclaimed by a number of critics (for example, 
“A Symposium on the Cochran-Mosteller-Tukey Report” in the September, 1955 issue 
of the Journal) and is already being used in some classrooms as a course book and reference 
work. 

The “Proceedings of the Business and Economic Statistics Section,” published early in 
1955, contains the papers presented at the sessions of the Section in Montreal in Sep- 
tember, 1954. It was hoped that this could become a self-supporting yearly project. Despite 
the fact that the Proceedings were advertised several times, sales have been below esti- 
mates. 

Both monographs are available to ASA members at a substantially lower price than to 
non-members. This is part of the Association’s plans to provide a more widespread pub- 
lications program. 
Committees 


1955 saw the creation of a new Committee to Investigate Statistics as Evidence. The 
Committee, under the chairmanship of John Tukey, was appointed in response to recom- 
mendations based on the fact that many lawyers fail to recognize the validity of statistics 
as evidence. The Committee will report to the Board in the future. 

Mr. Tukey also chairs an ad hoc Committee on Redistricting which was authorized at 
the fall meeting of the Board. For some time, the Board has been requested to look into the 
possibilities of rearranging the districts of the Association from at least two standpoints: 


a) Consideration of size of chapters, and 
b) Possible realignment of chapters within districts on the basis of sectional interest. 


The Committee was authorized to prepare material for a general mailing to the
membership, including a statement of principles, geographic considerations and appropriate
constitutional citations.


New Chapters 


Five new chapters were granted charters by the Board in 1955. A welcome is extended 
to the following groups: 


1. Buffalo-Niagara Chapter in Western New York State 

2. Rochester, New York 

3. Cincinnati—Rechartered after a lapse of about three years 
4. North Texas, in Dallas and vicinity 

5. Montreal—the first Canadian chapter 


The number of active chapters is now 37. Inquiries from other areas indicate further
growth in local activity.


Annual and Regional Meetings 


As a result of a survey conducted among a sample of ASA members on Annual Meeting 
times, the 1956 and 1957 dates have been set for September. One out of every five mem- 
bers was sent a ballot carrying three choices—Christmas week, September (after Labor 
Day) and late Spring. Of the ballots returned, about fifty per cent expressed first prefer- 
ence for September. The future meeting dates are as follows: 


1956—Detroit, Sept. 7-10 (joint with American Sociological Society) 
1957—Atlantic City, Sept. 10-13 (joint with Institute of Mathematical Statistics) 
1958—Chicago, Christmas Week (joint with Allied Social Science Associations) 


A regional meeting was held in Philadelphia in June, cosponsored by the Wharton 
School of the University of Pennsylvania and the Business and Economic Statistics Sec- 
tion. A Proceedings volume of the meeting was published and distributed to all attendees. 

The Section on Physical and Engineering Sciences held a meeting in conjunction with
the centennial celebration of the School of Engineering of New York University, in May. 

The Social Statistics Section joined the Bureau of Labor Statistics in sponsoring a
conference on manpower and employment statistics, at the University of Wisconsin, in July.


Other Activities 


An exchange arrangement has been made between the American Sociological Society and 
ASA to permit members of each group to subscribe to the other’s Journal at a reduced 
rate. The American Sociological Review is now available to ASA members for $3.75 per
year.

President Watkins appointed, with the approval of the Council, Donald C. Riley as 
ASA Representative to the International Statistical Institute and the Inter-American 
Statistical Institute. 1956 Committees and Representatives will appear in the February, 
1956 American Statistician.


SECRETARY-TREASURER’S REPORT, 1955


1955 has set a record for the number of new members joining the Association. 762 per- 
sons applied for membership in 1955. Added to this figure are 29 others who reinstated 
their membership this year. By the end of 1955 the names of about 450 persons were re- 
moved from the lists because of resignation, nonpayment of dues or death. The Associa- 
tion started 1955 with approximately 5,150 members. The net increase of about 340 mem- 
bers brings the total at the beginning of 1956 to just about 5,500 members. 
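
The membership figures above can be reconciled with a short calculation. The following is only an illustrative sketch: the variable names are hypothetical, and the inputs are the rounded figures quoted in the report, so the result agrees with the stated totals only approximately.

```python
# Rounded figures as quoted in the report (variable names are illustrative).
members_start_1955 = 5150   # "approximately 5,150 members"
new_members = 762           # persons who applied for membership in 1955
reinstated = 29             # persons who reinstated their membership
removed = 450               # "about 450" removed (resignation, nonpayment, death)

# Net change during the year and the implied opening total for 1956.
net_increase = new_members + reinstated - removed
members_start_1956 = members_start_1955 + net_increase

print(f"Net increase during 1955: {net_increase}")
print(f"Implied membership at start of 1956: {members_start_1956}")
```

The computed net increase is consistent with the "about 340" stated in the report, and the implied opening total is consistent with "just about 5,500 members".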

It is interesting to note that about 500 members, or one in eleven, are residents of foreign 
countries. The increase in foreign membership is partly due to the reduction in dues for 
residents outside North America. This reduction has made it easier for them to obtain 
American currency. 

The list of subscribers to the Journal of the ASA (libraries, business firms, governmental 
agencies, etc.) has been steadily increasing over the past few years. Since 1952 the numbers 
of subscribers are as follows: 

1952—1,248 
1953—1,356 
1954—1,437 
1955—1,482 


However, the financial picture for 1955 has not been as bright. The estimated total
deficit for the full year will approximate $5,000. Total income was budgeted at
$65,810. Actual income will reach about $68,500. The increase was caused primarily by
more new members, more advertising, and a successful exhibits show at the Annual Meeting.

The total expense was budgeted at $63,870. Actually, expenses will come to about 
$73,500. As explained in the Board of Directors report, increase in the size of publications 
and rising costs in printing are responsible for the major part of this excess over budget. 
This has caused a deficit for the first time in six years. Due to the 1955 deficit, the
cumulative surplus will be reduced from about $30,000 to $25,000.
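
The budget comparison in the preceding paragraphs can be laid out as a small worked calculation. This is a sketch under the assumption that the rounded estimates quoted in the report are used as given; the variable names are hypothetical, and the audited statements differ slightly from these estimates.

```python
# Budgeted and estimated actual figures quoted in the report (dollars).
budget_income, actual_income = 65_810, 68_500
budget_expense, actual_expense = 63_870, 73_500

# Result implied by the budget versus the estimated actual result.
budgeted_result = budget_income - budget_expense   # positive = surplus
actual_result = actual_income - actual_expense     # negative = deficit

# How far income and expense each departed from the budget.
income_over_budget = actual_income - budget_income
expense_over_budget = actual_expense - budget_expense

# Effect on the cumulative surplus ("about $30,000" before the 1955 deficit).
surplus_before = 30_000
surplus_after = surplus_before + actual_result

print(f"Budgeted result:     {budgeted_result:+,}")
print(f"Estimated result:    {actual_result:+,}")
print(f"Income over budget:  {income_over_budget:+,}")
print(f"Expense over budget: {expense_over_budget:+,}")
print(f"Cumulative surplus:  {surplus_before:,} -> {surplus_after:,}")
```

On these figures the excess of expenses over budget is several times the excess of income over budget, which is why a small budgeted surplus became a deficit of roughly $5,000.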

Your Secretary-Treasurer appreciates the confidence that has been placed in him, but 
will need all of the assistance and advice that members of the Association can give if he 
is to continue to develop the policies and program so ably carried on by Mr. Weiss. 


April 25, 1956 
To the Board of Directors of 
the American Statistical Association: 

I have examined the accompanying financial statements of the American Statistical 
Association relating to the year ended December 31, 1955. My examination was made in 
accordance with generally accepted auditing standards and, accordingly, included such 
tests of the accounting records and other auditing procedures as were considered necessary 
in the circumstances. 


The recorded cash receipts for the year were traced to the deposits shown on the bank 
statements, and the amounts for dues and subscriptions were tested against the member- 
ship and subscription records. The paid checks were inspected and related vouchers tested 
in support of cash disbursements for the year. The bank balances were reconciled with 
certificates obtained directly from the depositaries, and the cash on hand was counted and 
reconciled with the books during the course of the examination. I did not check the mem- 
bership and subscription records in detail or make any independent verification of the 
inventory of back numbers of Journals, the office records of which are based, in part, on 
data assembled in prior years. 

In accordance with a resolution passed by the Board of Directors, the expense incurred 
in publishing a directory, distributed to the membership beginning in 1954, is to be
spread over a three-year period although such costs would appear to be applicable pri- 
marily to the year 1954. The accounts for the year ended December 31, 1955, reflect a 
charge of $1,787.11, representing the allocated portion of the directory expense applicable 
to that period.

In my opinion, the accompanying statements present fairly the financial position of the
American Statistical Association on December 31, 1955, and the results of its operations 
for the year then ended, in accordance with generally accepted accounting principles, 
except as mentioned in the previous paragraph, applied on a basis consistent with that of 
the preceding year. 

James C. Jester
Certified Public Accountant 
Tower Building, Washington 5, D.C. 


AMERICAN STATISTICAL ASSOCIATION 


BALANCE SHEET 


December 31,

                                                          1955          1954

Cash in banks and on hand                                            $59,741.
Accounts receivable                                                      894.
Investment in United States Savings Bonds,
  Series G, due 1962, at cost                                          3,100.
Inventory of old Journals, at approximate cost                         2,360.
Inventory of Monograph on Acceptance Sampling                             45.
Inventory of Emblems, at cost                                            361.
Inventory of Monograph on Kinsey Report, at cost                       4,482.
Furniture and fixtures, at cost less accumulated
  depreciation                                                         1,908.9
Deferred charges:
  Deferred Membership Directory expense                                3,604.
  [label illegible]                                                    1,136.

Total Assets                                        $73,290.90    $77,636.11


Accounts payable                                    $16,447.46    $17,488.68

Deferred income (collections applicable to
subsequent years):
  [label illegible]                                 $20,908.50    $18,987.00
  [label illegible]                                   6,836.05      6,535.80
  [label illegible]                                     647.34        466.84

Total deferred income                               $28,391.89    $25,989.64

Net worth:
  Life Membership reserve                           $ 3,641.73    $ 3,796.98
  Surplus, per statement                             24,809.82     30,360.81

Total net worth                                     $28,451.55    $34,157.79

Total Liabilities and Net Worth                     $73,290.90    $77,636.11


AMERICAN STATISTICAL ASSOCIATION
STATEMENT OF INCOME AND SURPLUS ACCOUNTS


Year ended December 31,

                                                          1955          1954

Income:
  Dues—Current year                                 $42,192.90    $40,232.75
      —Prior year                                       160.00        155.00
  Subscriptions—Journal                                            10,869.22
               —American Statistician                                 462.75
  Advertising—Journal                                               1,319.82
             —American Statistician                                   125.00
  Sales—Journal                                                     2,870.95
       —American Statistician                                         146.95
       —Kinsey Report Monograph                                           —
       —Acceptance Sampling Monograph                                 175.00
       —Business and Economic Section Proceedings                         —
       —Emblems, less cost of sales                                    26.75
       —Biometrics                                                        —
  [label illegible]                                                     7.41
  Mailing list income                                                     92
  Interest income                                                   1,392.69
  Annual meeting                                      2,271.55            84
  Life membership income                                155.25            —
  Miscellaneous income                                  163.39         20.43

  Total income                                      $68,956.80    $59,209.

Expenses:
  Salaries                                          $15,476.78    $14,368.
  Publications—Schedule I                            43,419.       27,679.
  [label illegible]                                   1,430.75        644.
  [label illegible]                                   2,450.        2,400.
  [label illegible]                                     735.          761.
  [label illegible]                                   2,684.86      1,692.
  [label illegible]                                   2,327.        2,106.
  [label illegible]                                     919.        1,128.
  Accounting services                                   970.          970.
  Committee expense                                   1,917.        1,191.
  Annual meeting expense                                              996.
  Miscellaneous expenses—Schedule I                                 1,818.

  Total expenses                                    $74,507.79    $55,757.

Excess of (expenses) over income for the year      $(5,550.99)    $ 3,451.
Add: Surplus account at beginning of year            30,360.81     26,908.

Surplus account at end of year                      $24,809.82    $30,360.81


Schedule I


AMERICAN STATISTICAL ASSOCIATION


Year ended December 31,

                                                          1955          1954

Publications:
  Journal—Printing                                  $25,559.08    $16,994.
         —Abstracts
         —Editorial expense
         —Cost of old Journals
         —Delivery charges
         —Storage charges
  American Statistician
  Kinsey Report Monograph
  Acceptance Sampling Monograph
  Business and Economic Section Proceedings
  Membership Directory, allocated expense
    less sales

  Total publications                                $43,419.      $27,679.

Miscellaneous Expenses:
  Life Membership expense
  Depreciation
  Dues to other organizations
  Bank charges
  Repairs and maintenance
  Insurance

  Total miscellaneous expenses                                    $ 1,818.17


STATISTICAL THEORY IN RESEARCH


By R. L. ANDERSON, University of North Carolina, and T. A. BANCROFT,
Iowa State College. 399 pages, $7.00


INTRODUCTION TO THE THEORY OF STATISTICS


By ALEXANDER MOOD, General Analysis Corporation, Santa Monica, 
California. 431 pages, $6.50 


The book deals first with necessary concepts and probability, then proceeds with
the distribution and sampling theory. It also explores the two major problems of
scientific inference: the estimation of quantities and the testing of hypotheses.
Applications of the theory are amply illustrated, particularly in problems for
student solution.


ELEMENTARY STATISTICS 
For Students of Social Science and Business 


By R. CLAY SPROWLS, University of California, Los Angeles. 422 pages, $5.50


A basic, elementary text for all social science and liberal arts students. It deals 
primarily with the formulation of decisions based upon incomplete information. It 
considers statistics important as inference, not description. Emphasis is on principles 
of inference, the ideas of hypotheses, risks of error, and the evaluation of these risks
in terms of the operating characteristics of a statistical test. 


BASIC STATISTICAL CONCEPTS 
By JOE K. ADAMS, Bryn Mawr College, 316 pages, $5.50 


This new book develops some basic mathematical-logical concepts of statistics. It
provides an understanding of the language used in mathematical statistics, including
that of elementary calculus. Statistical inference is presented using finite
populations, making it possible for the beginning student to work through
the basic concepts without skipping any of the mathematics involved. Also included
are the most frequently used mathematical models, both discrete and continuous,
with numerous applications.


STATISTICAL METHODS
As Applied to Economic and Business Data 


by WILLIAM A. NEISWANGER
University of Illinois

READY FOR FALL CLASSES


A new edition of a text that has been 
a leader in its field since publication, 
a text that has been thoroughly tested 
in classrooms across the country. It 
now features managerial statistics, 
with emphasis on analysis, interpreta- 
tion and application to economics and 
business administration problems. 





60 Fifth Ave., New York 11


TABLES OF THE CUMULATIVE BINOMIAL PROBABILITY DISTRIBUTION


By the STAFF OF THE COMPUTATION LABORATORY OF 
HARVARD UNIVERSITY 


These tables give the binomial distribution for n = 1(1)50(2)100(10)200(20)500(50)1000
and for p = 0.01(0.01)0.50 with 10 additional common fractions, 1/16, 1/12,
1/8, 1/6, 3/16, 5/16, 1/3, 3/8, 5/12, and 7/16. Cumulative probabilities are shown to
5 decimals, and an appendix gives values of log n! to 10 decimals for n = 1(1)1199.
“Perhaps the most valuable feature of the Tables is a 28-page introductory section on 
applications of the binomial distribution by Frederick Mosteller. . . . Probably the best 
and most comprehensive treatment of statistical applications of the binomial distribu- 
tion available.”—Journal of the American Statistical Association. $8.00 


TABLES OF THE FUNCTION arc sin z 


By the STAFF OF THE COMPUTATION LABORATORY OF 
HARVARD UNIVERSITY 


The first adequate presentation of tables of the inverse sine in the complex domain. 
Both the argument and the function are given in cartesian form; six decimal places 
are provided. An introduction describes the properties of the function, the composition
of the tables, and methods for interpolation to within an error of 2 x 10~. Will meet 
a real demand in physics, engineering, and applied mathematics. $12.50 


Through your bookseller, or from
HARVARD UNIVERSITY PRESS 
Cambridge 38, Massachusetts


for STATISTICIANS and MATHEMATICIANS

Dr. Lincoln Hanson, Research Personnel Officer
7100 Connecticut Avenue, Chevy Chase 15, Maryland


How to become a satisfied mathematician
...Up to $13,000 in New York City! 


COMMUNICATIONS SYSTEMS ANALYST .. . 


Experienced in analytical evaluation of communications systems through 
modern theory and statistical analysis. You should have strong 
mathematical and analytical background. Familiarity with 

practical communications systems design, 

application and limitations desirable. 

Here are the challenges, the rewards—financial and professional— 

that a satisfied, successful mathematician requires. 

You'll find them in the growing New York City 

engineering operation of an electronics pioneer and leader. 


To arrange confidential interview, send resume to 
AMERICAN STATISTICAL ASSN., 
1757 K St., N.W., Washington 6, D.C. Attn: Dept. 100 














Copies Still Available


STATISTICAL PROBLEMS OF THE KINSEY REPORT 


by Cochran, Mosteller and Tukey. Contents include: Statistical Problems
of the Kinsey Report; Discussion of Comments by Selected Tech-
nical Reviewers; Comparison With Other Studies; Proposed Further 
Work; Probability Sampling Considerations; The Interview and The 
Office; Desirable Accuracy; Principles of Sampling. 331 pages, in blue 
buckram. Price: $3.00 to members of ASA; $5.00 to others. 


Order your copy directly from the American Statistical Association, 
1757 K Street, N.W., Washington 6, D.C.


STATISTICAL YEARBOOK, 1955


A total of one hundred and forty-eight countries and territories have supplied
information to make this seventh in the series of Yearbooks perhaps the most
complete yet published. There are, for the first time, statistical series from the
USSR. Included are tables on population, agriculture, fishing, production, internal
trade, national income, social statistics, education, and culture. New tables have
been added for world trade, wholesale prices of selected commodities, cinemas,
and television transmitting stations. There is an English and French index.
A United Nations publication. 644 pp. Cloth $7.50, paper $6.00.


DEMOGRAPHIC YEARBOOK, 1955


This seventh issue of the Yearbook provides a comprehensive reference volume
for international population-census data relating to the decade ending 1954.
One-third of the volume is devoted to new and revised distributions on world,
regional, national, subnational, city populations, urban and rural populations,
population distributed by size of localities, marital status, literacy, and measures
of fertility. Included are tables on births, stillbirths, marriages, and divorces.
A United Nations publication. 781 pp. Cloth $8.50, paper $7.00.


METHODS OF COLLECTING
CURRENT AGRICULTURAL
STATISTICS


R. D. NARAIN. This manual brings together the methods followed in Europe, North 
and Central America, South America, Asia, Africa, and Oceania for collecting current 
agricultural statistics. The information for each country includes administrative
division, agricultural statistics collected, collection of data, sampling methods, collection
and tabulation, time schedule, processing of data and publications. Issued in looseleaf 
form, this volume facilitates revisions as methods improve and change. A Food and 
Agriculture Organization publication. 294 pp. Boards $3.00. 











HANDBOOK OF VITAL STATISTICS 
METHODS 


Comprised of a world-wide cross-section of actual practices, procedures, and methods, 
this volume discusses the evolution and present status of vital records and statistics, 
governmental provision, registration, reporting process and the statistical report, com- 
pilation, and tabulation. Tables, index, and a selected list of references are included. 
A United Nations publication. 258 pp. Paper $2.50. 



















International Documents Service 
COLUMBIA UNIVERSITY PRESS 
New York 27, New York 







Please mention the Journal of the American Statistical Association in writing advertisers





CHAPTER PRESIDENTS


ALBANY              Abbott S. Weinstein, 18-B Old Hickory Drive, Albany 4, New York

AUSTIN              John H. Hargrove, 2005 Raleigh, Austin, Texas

BOSTON              John E. Alman, Office of Statistical & Research Serv., Boston
                    University, 785 Commonwealth Avenue, Boston 15, Massachusetts

BUFFALO-NIAGARA     Robert Mirsky, Cornell Aeronautical Laboratory, 4455 Genessee
                    Street, Buffalo 21, New York

CENTRAL INDIANA     Edgar P. King, Eli Lilly and Company, Indianapolis 6, Indiana

CENTRAL NEW JERSEY  John Q. Stewart, Princeton University Observatory, 14 Prospect
                    Avenue, Princeton, New Jersey

CHICAGO             Adolph O. Berger, 244 Hazel Street, Glencoe, Illinois

CLEVELAND           Clark E. Zimmerman, McCann-Erickson, Inc., 1300 National
                    City Bank Bldg., Cleveland 14, Ohio

COLUMBUS            Merriss Cornell, School of Social Adm., Ohio State University,
                    Columbus 10, Ohio

CONNECTICUT         James Tobin, Yale University, New Haven, Connecticut

DAYTON              H. Leon Harter, 5335 B. Cobb Drive, Dayton 3, Ohio

DENVER              Roland A. Mandat, Coates, Herfurth, and England, Consulting
                    Actuaries, 628 Majestic Building, Denver, Colorado

DETROIT             Wallace W. Gardner, School of Business Administration,
                    University of Michigan, Ann Arbor, Michigan

HAWAII              Albert L. Tester, Dept. of Zoology, University of Hawaii,
                    Honolulu, Hawaii

ILLINOIS            Vincent I. West, Department of Agricultural Economics,
                    University of Illinois, Urbana, Illinois

ITHACA              C. R. Henderson, Department of Husbandry, Cornell University,
                    Ithaca, New York

LOS ANGELES         Hugh H. Brown, California Taxpayers Association, 750 Pacific
                    Electric Building, Los Angeles 14, California

MILWAUKEE           William A. Golomski, Instructor in Mathematics, Marquette
                    University, Milwaukee, Wisconsin

MONTREAL            Earl F. Beach, 1020 Pine Avenue West, Montreal 2, Quebec,
                    Canada

NEW ORLEANS         Roland Pertuit, 4871 Metropolitan Drive, New Orleans, Louisiana

NEW YORK            Robert E. Johnson, Western Electric Co., 195 Broadway, New
                    York 7, N.Y.

NORTH CAROLINA      Gertrude M. Cox, Institute of Statistics, Box 5457, State College
                    Station, Raleigh, North Carolina

NORTH TEXAS         Albert W. Wortham, 3919 Pyka Drive, Dallas, Texas

OKLAHOMA CITY       Richard W. Poole, Oklahoma City Chamber of Commerce, Skirvin
                    Towers Hotel, Oklahoma City, Oklahoma

PHILADELPHIA        Hyman Menduke, 1517 East Mt. Pleasant Avenue, Philadelphia
                    38, Pennsylvania

PITTSBURGH          Donovan J. Thompson, Graduate School of Public Health,
                    University of Pittsburgh, Pittsburgh 17, Pennsylvania

PUERTO RICO         Luz M. Torruellas, Puerto Rican Economic Association, P.O. Box
                    2003, University Station, Rio Piedras, Puerto Rico

ROCHESTER, N.Y.     S. Lee Crump, Atomic Energy Project, P.O. Box 287, Station 3,
                    Rochester 20, New York

SACRAMENTO          Richard D. Morgan, 2748 6th Avenue, Sacramento, California

SAN FRANCISCO       Helen Nelson, Div. of Labor Statistics & Research, Calif. Dept. of
                    Industrial Relations, P.O. Box 965, San Francisco, California

SEATTLE             Grant I. Butterbaugh, 6815 20th Avenue, N.E., Seattle 5,
                    Washington

STATE COLLEGE, PA.  James B. Bartoo, Pennsylvania State College, State College,
                    Pennsylvania

ST. LOUIS           Arthur C. Meyers, Jr., 3674 Lindell, St. Louis 8, Missouri

TULSA               Robert Spears, Oklahoma A & M College, Stillwater, Oklahoma

VIRGINIA            John E. Freund, Virginia Polytechnic Institute, Dept. of Statistics,
                    Blacksburg, Virginia

WASHINGTON, D.C.    Rexford C. Parmelee, 4700 47th Street, N. W., Washington 16,
                    D.C.





ALBANY

AUSTIN

BOSTON

CHICAGO

CLEVELAND

COLUMBUS

CONNECTICUT

                    Charles Congdon, ...
                    ... Hawaii, ...
                    ... University of ...
                    ... Ithaca, New York
                    ... Pulp and Paper Research Institute, ... St., Montreal, Canada

NEW ORLEANS

NEW YORK

NORTH CAROLINA

NORTH TEXAS

OKLAHOMA CITY

PHILADELPHIA

PITTSBURGH

PUERTO RICO

ROCHESTER, N.Y.     Jack Karger, 210 East Hickory St., East Rochester, New York

SACRAMENTO          Maurice K. Strants, 8761 El Ricon, Sacramento $1, California

SAN FRANCISCO       Miss Phillis Beattie, U.S. Bureau of Labor Statistics, 680 Sansome
                    Street, Room 808, San Francisco 11, California

SEATTLE             Clyde Courtnage, Accounting Department, Frederick and Nelson,
                    5th at Pine, Seattle, Washington

STATE COLLEGE, PA.  George E. Brandow, $12 East Mitchell Ave., State College,
                    Pennsylvania

ST. LOUIS           George Little, c/o Southwestern Bell Telephone Co., 1010 Pine St.,
                    St. Louis 1, Missouri

TULSA               Milton F. Searl, Stanolind Oil and Gas Company, P.O. Box 591,
                    Tulsa, Oklahoma

VIRGINIA            Clyde Y. Kramer, Dept. of Statistics, Virginia Polytechnic Inst.,
                    Blacksburg, Virginia

WASHINGTON, D.C.    Dorothy M. Gilford, Statistics Branch, Office of Naval Research,
                    Washington 25, D.C.





MANAGERIAL STATISTICS
by KERMIT O. HANSON, University of Washington


Simply and logically this text explains the scope, techniques and 
potential benefits of statistical methods in sales forecasting for 
administrative planning and control. The material has been 
class-tested for several years prior to publication.


Chapter endings contain questions and exercises; there is a 
wealth of illustrative material drawn from business reports. 
The book is simple and practical, excluding statistical techniques 
with limited or no practical application to management problems. 


306 pages 516” x 814” Published 1955 


MODERN ELEMENTARY STATISTICS 


by JOHN E. FREUND, Alfred University


This widely adopted text presents modern statistical techniques 
and tools at the beginner's level. Emphasis is on the logical 
principles which underlie modern statistical techniques theory 
Some of the outstanding features: a special chapter on the nature
of scientific predictions, 108 illustrations and exercises covering